System and method for the utilization of pricing models in the aggregation, analysis, presentation and monetization of pricing data for vehicles and other commodities

ABSTRACT

Embodiments of systems and methods for the aggregation, analysis, display and monetization of pricing data for commodities in general, and which may be particularly useful applied to vehicles, are disclosed. In certain embodiments, one or more models may be applied over a set of historical transaction data associated with a vehicle configuration to determine pricing data. Some models may leverage incremental data in various conditions, including cases where fewer than a desired number of historical transactions are present in the bin of a specified vehicle, where fewer than, equal to, or more than a certain number of list prices for the specified vehicle are available, and where no historical transaction data for new models is available.

RELATED APPLICATIONS

This application is a continuation-in-part of U.S. patent application Ser. No. 12/556,109, filed Sep. 9, 2009, entitled “SYSTEM AND METHOD FOR CALCULATING AND DISPLAYING PRICE DISTRIBUTIONS BASED ON ANALYSIS OF TRANSACTIONS,” which claims priority from Provisional Patent Applications No. 61/095,550, filed Sep. 9, 2008, entitled “SYSTEM AND METHOD FOR AGGREGATION, ANALYSIS, AND MONETIZATION OF PRICING DISTRIBUTION DATA FOR VEHICLES AND OTHER COMMODITIES” and No. 61/095,376, filed Sep. 9, 2008, entitled “SYSTEM AND METHOD FOR CALCULATING AND DISPLAYING COMPLEX PRODUCT PRICE DISTRIBUTIONS BASED ON AGGREGATION AND ANALYSIS OF INDIVIDUAL TRANSACTIONS.” This application also claims priority from U.S. Provisional Patent Application No. 61/248,083, filed Oct. 2, 2009, entitled “SYSTEM AND METHOD FOR THE UTILIZATION OF PRICING MODELS IN THE AGGREGATION, ANALYSIS, PRESENTATION AND MONETIZATION OF PRICING DATA FOR VEHICLES AND OTHER COMMODITIES.” All applications referenced herein are hereby fully incorporated for all purposes.

TECHNICAL FIELD

The present disclosure relates generally to commodity pricing. More particularly, the present disclosure relates to the aggregation, analysis and presentation of data pertaining to a commodity. Even more specifically, the present disclosure relates to the use of embodiments of pricing models in different contexts during the analysis of this data.

BACKGROUND

Consumers are at a serious negotiation disadvantage when they do not have information relevant to a specifically desired product or do not understand such information. Exacerbating this problem is the fact that complex, negotiated transactions can be difficult for consumers to understand due to a variety of factors, including interdependence between local demand and availability of products or product features, the point-in-time in the product lifecycle at which a transaction occurs, and the interrelationships of various transactions to one another. For example, a seller may sacrifice margin on one aspect of one transaction and recoup that margin from another transaction with the same (or a different) customer. Furthermore, currently available data for complex transactions is single dimensional. To illustrate with a specific example, a recommended price (e.g. $1,000) may not take into account how sensitive that price is (is $990 a good or bad price?). Recommended prices also become decreasingly accurate as the product, location, and availability of a particular product are defined with greater specificity.

These circumstances can be seen in a variety of contexts. In particular, the automotive transaction process may entail complexity of this type. Specifically, the price a consumer pays may depend on the vehicle, the dealership, historical patterns, anticipated sales patterns, promotion programs, the customer's and dealer's emotions on a particular day, the time of the day, the day of the month, the dynamics of the negotiation itself, and so on. Oftentimes, neither the consumers nor the dealers can fully understand what a good or great price is for a certain vehicle having a particular combination of make, model, trim combinations or packages, etc. Additionally, even though new vehicles are commodities, transparent pricing information resources for consumers simply do not exist. Some dealers attempt to optimize or maximize pricing from each individual customer through the negotiation process which inevitably occurs with customers in the setting of an automotive vehicle purchase.

There are therefore a number of unmet desires when it comes to obtaining, analyzing and presenting vehicle pricing data.

SUMMARY

Embodiments of systems and methods for the aggregation, analysis, display and monetization of pricing data for commodities in general, and which may be particularly useful applied to vehicles, are disclosed. In particular, in certain embodiments, historical transaction data may be aggregated into data sets and the data sets processed to determine pricing data, where this determined pricing data may be associated with a particular configuration of a vehicle.

In one embodiment, the processing may comprise applying one or more models to a set of historical transaction data associated with a vehicle configuration to determine pricing data including, for example, an average price, a set of transaction prices and one or more price ranges including a good price range and a great price range.

Using the historical transactions associated with a user-specified vehicle configuration, desired pricing information may be obtained. In many cases, however, there may be fewer historical transactions in a bin for a specified vehicle than is desired to generate reliable or accurate predictions. Such data scarcity may occur in the cases of a model which is relatively new to the market or is an exotic of which few models are sold. To increase the accuracy of determined pricing data, it may be useful to apply different or additional price ratio models that leverage incremental data.

Embodiments of the present invention may determine a set of price models to utilize in various conditions and utilize appropriate models in the cases where such conditions are extant. As these price models may be applicable in instances where data is scarce, they may be referred to herein as data scarcity models. In one embodiment, one or more price models may be generated for use in cases where fewer than a desired number of historical transactions are present in the bin of a specified vehicle. In one embodiment, one or more price models may be generated for cases where there are fewer than, equal to, or more than a certain number of list prices for the specified vehicle available. In one embodiment, one or more price models may pertain to new vehicle models, where certain of these price models may be determined for cases where there is historical transaction data for a similar make and model from a past year and other price models determined for new models in cases where there is no such historical transaction data. In some embodiments, in addition to the general price model, one or more data scarcity models may be utilized by the vehicle data system to determine and present pricing data for the specified vehicle.
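As an illustration only, the choice among a general price model and the data scarcity models described above might be sketched as follows. This is a minimal sketch under stated assumptions: the threshold values, the model labels, the `BinSummary` type and the order in which the conditions are checked are all hypothetical and are not specified by this disclosure.

```python
from dataclasses import dataclass

# Hypothetical thresholds; the disclosure does not give specific values.
MIN_TRANSACTIONS = 20      # "desired number" of historical transactions
LIST_PRICE_THRESHOLD = 5   # "certain number" of list prices


@dataclass
class BinSummary:
    """Counts describing the data available for a specified vehicle (assumed shape)."""
    transaction_count: int             # historical transactions in the vehicle's bin
    list_price_count: int              # list prices available for the vehicle
    is_new_model: bool = False         # vehicle model is new to the market
    has_prior_year_data: bool = False  # similar make and model from a past year


def select_price_model(summary: BinSummary) -> str:
    """Return an illustrative label for the price model to apply."""
    if summary.is_new_model:
        # New models: use prior-year history when it exists.
        if summary.has_prior_year_data:
            return "new_model_with_prior_year"
        return "new_model_no_history"
    if summary.transaction_count >= MIN_TRANSACTIONS:
        # Enough transactions in the bin: the general model suffices.
        return "general"
    # Data scarcity: branch on the number of available list prices.
    if summary.list_price_count < LIST_PRICE_THRESHOLD:
        return "scarcity_fewer_list_prices"
    if summary.list_price_count == LIST_PRICE_THRESHOLD:
        return "scarcity_equal_list_prices"
    return "scarcity_more_list_prices"
```

In this sketch the new-model check is made first so that a newly introduced vehicle is not routed to a list-price-based model by accident; an actual embodiment could order or combine the conditions differently.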

These, and other, aspects of the invention will be better appreciated and understood when considered in conjunction with the following description and the accompanying drawings. The following description, while indicating various embodiments of the invention and numerous specific details thereof, is given by way of illustration and not of limitation. Many substitutions, modifications, additions or rearrangements may be made within the scope of the invention, and the invention includes all such substitutions, modifications, additions or rearrangements.

BRIEF DESCRIPTION OF THE DRAWINGS

The drawings accompanying and forming part of this specification are included to depict certain aspects of the invention. A clearer impression of the invention, and of the components and operation of systems provided with the invention, will become more readily apparent by referring to the exemplary, and therefore nonlimiting, embodiments illustrated in the drawings, wherein identical reference numerals designate the same components. Note that the features illustrated in the drawings are not necessarily drawn to scale.

FIG. 1 depicts one embodiment of a topology including a vehicle data system.

FIGS. 2A and 2B depict one embodiment of a method for determining and presenting pricing data.

FIG. 3 depicts one embodiment of an architecture for a vehicle data system.

FIGS. 4A and 4B depict one embodiment of a method for determining and presenting pricing data.

FIG. 5 depicts one embodiment of a method for determining and presenting pricing data.

FIG. 6 depicts a distribution associated with the determination of an equation.

FIGS. 7A and 7B depict embodiments of interfaces for the presentation of pricing data.

FIGS. 8A and 8B depict embodiments of interfaces for the presentation of pricing data.

FIGS. 9A-9D depict embodiments of interfaces for obtaining vehicle configuration information and the presentation of pricing data.

FIGS. 10A-14 graphically depict the creation of pricing data.

FIGS. 15-18 depict embodiments of interfaces for the presentation of pricing data.

FIG. 19 depicts one embodiment of a method for determining dealer cost.

FIG. 20 depicts one embodiment of a method for determining and presenting pricing data.

FIG. 21 depicts one embodiment of determining a price model.

FIG. 22 depicts one embodiment of determining a price model.

FIG. 23 depicts one embodiment of determining a price model.

FIG. 24 depicts one embodiment of determining a price model.

FIG. 25 depicts one embodiment of a method of operating a vehicle data system which employs multiple price models.

DETAILED DESCRIPTION

The invention and the various features and advantageous details thereof are explained more fully with reference to the nonlimiting embodiments that are illustrated in the accompanying drawings and detailed in the following description. Descriptions of well known starting materials, processing techniques, components and equipment are omitted so as not to unnecessarily obscure the invention in detail. It should be understood, however, that the detailed description and the specific examples, while indicating preferred embodiments of the invention, are given by way of illustration only and not by way of limitation. Various substitutions, modifications, additions and/or rearrangements within the spirit and/or scope of the underlying inventive concept will become apparent to those skilled in the art from this disclosure. Embodiments discussed herein can be implemented in suitable computer-executable instructions that may reside on a computer readable medium (e.g., a HD), hardware circuitry or the like, or any combination.

Before discussing specific embodiments, embodiments of a hardware architecture for implementing certain embodiments are described herein. One embodiment can include one or more computers communicatively coupled to a network. As is known to those skilled in the art, the computer can include a central processing unit (“CPU”), at least one read-only memory (“ROM”), at least one random access memory (“RAM”), at least one hard drive (“HD”), and one or more input/output (“I/O”) device(s). The I/O devices can include a keyboard, monitor, printer, electronic pointing device (such as a mouse, trackball, stylus, etc.), or the like. In various embodiments, the computer has access to at least one database over the network.

ROM, RAM, and HD are computer memories for storing computer instructions executable (in other words, which can be directly executed or made executable by, for example, compilation, translation, etc.) by the CPU. Within this disclosure, the term “computer-readable medium” is not limited to ROM, RAM, and HD and can include any type of data storage medium that can be read by a processor. In some embodiments, a computer-readable medium may refer to a data cartridge, a data backup magnetic tape, a floppy diskette, a flash memory drive, an optical data storage drive, a CD-ROM, ROM, RAM, HD, or the like.

At least portions of the functionalities or processes described herein can be implemented in suitable computer-executable instructions. The computer-executable instructions may be stored as software code components or modules on one or more computer readable media (such as non-volatile memories, volatile memories, DASD arrays, magnetic tapes, floppy diskettes, hard drives, optical storage devices, etc. or any other appropriate computer-readable medium or storage device). In one embodiment, the computer-executable instructions may include lines of compiled C++, Java, HTML, or any other programming or scripting code.

Additionally, the functions of the disclosed embodiments may be implemented on one computer or shared/distributed among two or more computers in or across a network. Communications between computers implementing embodiments can be accomplished using any electronic, optical, radio frequency signals, or other suitable methods and tools of communication in compliance with known network protocols.

As used herein, the terms “comprises,” “comprising,” “includes,” “including,” “has,” “having” or any other variation thereof, are intended to cover a non-exclusive inclusion. For example, a process, product, article, or apparatus that comprises a list of elements is not necessarily limited only to those elements but may include other elements not expressly listed or inherent to such process, product, article, or apparatus. Further, unless expressly stated to the contrary, “or” refers to an inclusive or and not to an exclusive or. For example, a condition A or B is satisfied by any one of the following: A is true (or present) and B is false (or not present), A is false (or not present) and B is true (or present), and both A and B are true (or present).

Additionally, any examples or illustrations given herein are not to be regarded in any way as restrictions on, limits to, or express definitions of, any term or terms with which they are utilized. Instead, these examples or illustrations are to be regarded as being described with respect to one particular embodiment and as illustrative only. Those of ordinary skill in the art will appreciate that any term or terms with which these examples or illustrations are utilized will encompass other embodiments which may or may not be given therewith or elsewhere in the specification and all such embodiments are intended to be included within the scope of that term or terms. Language designating such nonlimiting examples and illustrations includes, but is not limited to: “for example,” “for instance,” “e.g.,” “in one embodiment.”

The invention and the various features and advantageous details thereof are explained more fully with reference to the non-limiting embodiments that are illustrated in the accompanying drawings and detailed in the following description. These embodiments may be better understood with reference to U.S. patent application Ser. No. 12/556,076, entitled “SYSTEM AND METHOD FOR AGGREGATION, ANALYSIS, PRESENTATION AND MONETIZATION OF PRICING DATA FOR VEHICLES AND OTHER COMMODITIES” by Taira et al., U.S. patent application Ser. No. 12/556,109, entitled “SYSTEM AND METHOD FOR CALCULATING AND DISPLAYING PRICE DISTRIBUTIONS BASED ON ANALYSIS OF TRANSACTIONS” by Taira et al., and U.S. patent application Ser. No. 12/556,137, entitled “SYSTEM AND METHOD FOR SALES GENERATION IN CONJUNCTION WITH A VEHICLE DATA SYSTEM” by Inghelbrecht et al., all of which were filed on Sep. 9, 2009 and are fully incorporated by reference herein. Descriptions of well known starting materials, processing techniques, components and equipment are omitted so as not to unnecessarily obscure the invention in detail. It should be understood, however, that the detailed description and the specific examples, while indicating preferred embodiments of the invention, are given by way of illustration only and not by way of limitation. Various substitutions, modifications, additions and/or rearrangements within the spirit and/or scope of the underlying inventive concept will become apparent to those skilled in the art from this disclosure. For example, though embodiments of the present invention have been presented using the example commodity of vehicles it should be understood that other embodiments may be equally effectively applied to other commodities.

As discussed above, complex, negotiated transactions can be difficult for consumers to understand due to a variety of factors, especially in the context of a vehicle purchase. In particular, the historical lack of transparency around vehicle pricing still exists in the automotive industry, resulting in cases where different consumers can go to the same dealership on the same day and pay substantially different prices for the exact same vehicle sold by the same salesperson.

To remedy this lack of availability of pricing information a variety of solutions have been unsuccessfully attempted. In the mid 1990s, companies such as Autobytel (www.autobytel.com) launched websites focused on enabling consumers' access to manufacturers' new car pricing information. Soon after, Kelley Blue Book (www.KBB.com) launched its own websites that enabled consumers to determine approximate “trade in values” and “retail values” of used cars.

In 1998, CarsDirect developed its own interpretation of what “consumers should pay” for a vehicle by launching its upfront pricing tools. CarsDirect's upfront price is a published figure a consumer could actually purchase a vehicle for through CarsDirect's auto brokering service. This price subsequently became the consumer benchmark for negotiating with dealers in their area.

In 2000, Edmunds (www.edmunds.com) launched a pricing product called True Market Value (TMV), which is marketed on their website as “calculating what others are paying for new and used vehicles, based on real sales data from your geographic area.” This vague language enables Edmunds to represent their data to their customer as accurate while the data may only be what they believe the typical buyer is paying for a specific vehicle within a pre-determined region. Although not necessarily accurate, TMV has become the most widely recognized new car pricing “average” in the market place.

In 2005, Zag (Zag.com) launched an affinity auto buying program that enabled consumers to purchase upfront pricing from its network of nationwide dealer partners. Partner dealers are required to input low, “fleet” level pricing in Zag's pricing management system. These prices are displayed to the consumer and are measured against Kelley Blue Book's New Car Blue Book Value (which is similar to Edmunds' TMV) and these prices are defined by Zag as “what people are really paying for a vehicle.”

Problematically, current consumer vehicle pricing resources, including KBB.com, Edmunds.com and various blogs and research sites, allow for the configuration of a particular vehicle but only present a single recommended price for the vehicle, no matter the specified configuration. Due to a variety of circumstances (including the lack of transparency of how the recommended price was determined, whether and how any actual data was used to determine the recommended price or how such data was obtained) there is no indication of where the recommended price sits relative to prices others paid and whether the recommended price is a good price, a great price, etc. (either relative to other prices, or in an absolute sense). Additionally, many of the existing pricing sites are “lead generation” sites, meaning that they generate revenue by referring consumers to dealers without requiring dealers to commit to a specific price, inherently making these types of sites biased in favor of dealers when presenting pricing to consumers. Moreover, these pricing recommendation sites may not utilize actual sales transaction data, but may instead present estimates calculated manually based on aggregated or manipulated data.

Accordingly, a myriad of problems exist with current approaches to pricing solutions for vehicles and other commodities. One such problem is that a consumer may not have any context with which to interpret a price obtained from a vehicle pricing resource and therefore, a consumer may have little idea what is a good price, a great price, an average price, etc., nor will they know what the dealer's actual cost is for a desired vehicle. This confusion may be exacerbated given the number of variables which may have a bearing on that particular consumer's transaction, including the particular locale where the consumer intends to purchase the vehicle or the specific configuration of vehicle desired by the consumer. Consequently, the consumer may not be convinced that a price provided by a pricing site is particularly relevant to their situation or goals and may therefore only be able to use such a provided price as a baseline.

There are therefore a number of unmet desires when it comes to obtaining new or used vehicle pricing. These desires include the ability to use actual sales transaction data in the calculation of prices for particular vehicles and account for variations in the configuration of vehicles and the geography in which the vehicle will be purchased. Furthermore, it may be desired that such pricing data is analyzed and displayed in such a manner that a holistic view of pertinent sales transaction data can be presented to allow the distribution of pertinent sales data and the various ranges of prices to be easily ascertained and a determination of certain price levels easily made.

To meet those needs among others, attention is now directed to the aggregation, analysis, display and monetization of pricing data for commodities in general, and which may be particularly useful applied to vehicles. In particular, actual sales transaction data may be obtained from a variety of sources. This historical transaction data may be aggregated into data sets and the data sets processed to determine desired pricing data, where this determined pricing data may be associated with a particular configuration (e.g. make, model, power train, options, etc.) of a vehicle. An interface may be presented to a user where a user may provide relevant information such as attributes of a desired vehicle configuration, a geographic area, etc. The user can then be presented with a display pertinent to the provided information utilizing the aggregated data set or the associated determined pricing data where the user can make a variety of determinations such as a mean price, dealer cost or factory invoice for a desired vehicle, pricing distributions, etc. based on the provided display. In one embodiment, this interface may be a website such that the user can go to the website to provide relevant information and the display corresponding to the provided information is presented to the user through the website.

Embodiments of the systems and methods of the present invention may be better explained with reference to FIG. 1 which depicts one embodiment of a topology which may be used to implement embodiments of the systems and methods of the present invention. Topology 100 comprises a set of entities including vehicle data system 120 (also referred to herein as the TrueCar system) which is coupled through network 170 to computing devices 110 (e.g. computer systems, personal data assistants, kiosks, dedicated terminals, mobile telephones, smart phones, etc.), and one or more computing devices at inventory companies 140, original equipment manufacturers (OEM) 150, sales data companies 160, financial institutions 182, external information sources 184, departments of motor vehicles (DMV) 180 and one or more associated point of sale locations, in this embodiment, car dealers 130. Network 170 may be, for example, a wireless or wireline communication network such as the Internet or wide area network (WAN), public switched telephone network (PSTN) or any other type of electronic or non-electronic communication link such as mail, courier services or the like.

Vehicle data system 120 may comprise one or more computer systems with central processing units executing instructions embodied on one or more computer readable media where the instructions are configured to perform at least some of the functionality associated with embodiments of the present invention. These applications may include a vehicle data application 190 comprising one or more applications (instructions embodied on a computer readable medium) configured to implement an interface module 192, data gathering module 194 and processing module 196 utilized by the vehicle data system 120. Furthermore, vehicle data system 120 may include data store 122 operable to store obtained data 124, data 126 determined during operation, models 128 which may comprise a set of dealer cost models or price ratio models, or any other type of data associated with embodiments of the present invention or determined during the implementation of those embodiments.

Vehicle data system 120 may provide a wide degree of functionality including utilizing one or more interfaces 192 configured to, for example, receive and respond to queries from users at computing devices 110; interface with inventory companies 140, manufacturers 150, sales data companies 160, financial institutions 182, DMVs 180 or dealers 130 to obtain data; or provide data obtained, or determined, by vehicle data system 120 to any of inventory companies 140, manufacturers 150, sales data companies 160, financial institutions 182, DMVs 180, external data sources 184 or dealers 130. It will be understood that the particular interface 192 utilized in a given context may depend on the functionality being implemented by vehicle data system 120, the type of network 170 utilized to communicate with any particular entity, the type of data to be obtained or presented, the time interval at which data is obtained from the entities, the types of systems utilized at the various entities, etc. Thus, these interfaces may include, for example, web pages, web services, a data entry or database application to which data can be entered or otherwise accessed by an operator, or almost any other type of interface which it is desired to utilize in a particular context.

In general, then, using these interfaces 192 vehicle data system 120 may obtain data from a variety of sources, including one or more of inventory companies 140, manufacturers 150, sales data companies 160, financial institutions 182, DMVs 180, external data sources 184 or dealers 130 and store such data in data store 122. This data may be then grouped, analyzed or otherwise processed by vehicle data system 120 to determine desired data 126 or models 128 which are also stored in data store 122. A user at computing device 110 may access the vehicle data system 120 through the provided interfaces 192 and specify certain parameters, such as a desired vehicle configuration or incentive data the user wishes to apply, if any. The vehicle data system 120 can select a particular set of data in the data store 122 based on the user specified parameters, process the set of data using processing module 196 and models 128, generate interfaces using interface module 192 using the selected data set and data determined from the processing, and present these interfaces to the user at the user's computing device 110. More specifically, in one embodiment interfaces 192 may visually present the selected data set to the user in a highly intuitive and useful manner.
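The select-then-process flow just described can be pictured with a short sketch. It is illustrative only and rests on several assumptions: the in-memory list standing in for data store 122, the record fields, the made-up sample values, and the function names `select_data_set` and `process_data_set` are all hypothetical, and the summary statistics shown are simply one plausible output of a processing step.

```python
from statistics import mean

# Hypothetical in-memory stand-in for data store 122; all records are made up.
DATA_STORE = [
    {"make": "Acme", "model": "Roadster", "zip": "90401", "price": 24500.0},
    {"make": "Acme", "model": "Roadster", "zip": "90405", "price": 25100.0},
    {"make": "Acme", "model": "Roadster", "zip": "10001", "price": 26000.0},
    {"make": "Acme", "model": "Wagon", "zip": "90401", "price": 31000.0},
]


def select_data_set(store, make, model, zip_prefix=""):
    """Select the records matching the user-specified parameters."""
    return [
        record
        for record in store
        if record["make"] == make
        and record["model"] == model
        and record["zip"].startswith(zip_prefix)
    ]


def process_data_set(records):
    """Reduce the selected records to summary pricing data, or None if empty."""
    prices = [record["price"] for record in records]
    if not prices:
        return None
    return {
        "count": len(prices),
        "average": mean(prices),
        "low": min(prices),
        "high": max(prices),
    }


# A user asks about an Acme Roadster in ZIP codes beginning with 904.
selected = select_data_set(DATA_STORE, "Acme", "Roadster", zip_prefix="904")
summary = process_data_set(selected)
```

The resulting `summary` dictionary is the kind of determined data that an interface module could then render, for example as a histogram or price curve.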

In particular, in one embodiment, a visual interface may present at least a portion of the selected data set as a price curve, bar chart, histogram, etc. that reflects quantifiable prices or price ranges (e.g. “average,” “good,” “great,” “overpriced” etc.) relative to reference pricing data points (e.g., invoice price, MSRP, dealer cost, market average, internet average, etc.). Using these types of visual presentations may enable a user to better understand the pricing data related to a specific vehicle configuration. Additionally, by presenting data corresponding to different vehicle configurations in a substantially identical manner, a user can easily make comparisons between pricing data associated with different vehicle configurations. To further aid the user's understanding of the presented data, the interface may also present data related to incentives which were utilized to determine the presented data or how such incentives were applied to determine presented data.

Turning to the various other entities in topology 100, dealer 130 may be a retail outlet for vehicles manufactured by one or more of OEMs 150. To track or otherwise manage sales, finance, parts, service, inventory and back office administration needs dealers 130 may employ a dealer management system (DMS) 132. Since many DMS 132 are Active Server Pages (ASP) based, transaction data 134 may be obtained directly from the DMS 132 with a “key” (for example, an ID and Password with set permissions within the DMS system 132) that enables data to be retrieved from the DMS system 132. Many dealers 130 may also have one or more web sites which may be accessed over network 170, where pricing data pertinent to the dealer 130 may be presented on those web sites, including any pre-determined, or upfront, pricing. This price is typically the “no haggle” (price with no negotiation) price and may be deemed a “fair” price by vehicle data system 120.

Inventory companies 140 may be one or more inventory polling companies, inventory management companies or listing aggregators which may obtain and store inventory data from one or more of dealers 130 (for example, obtaining such data from DMS 132). Inventory polling companies are typically commissioned by the dealer to pull data from a DMS 132 and format the data for use on websites and by other systems. Inventory management companies manually upload inventory information (photos, description, specifications) on behalf of the dealer. Listing aggregators get their data by “scraping” or “spidering” websites that display inventory content and receiving direct feeds from listing websites (for example, Autotrader, FordVehicles.com).

DMVs 180 may collectively include any type of government entity to which a user provides data related to a vehicle. For example, when a user purchases a vehicle it must be registered with the state (for example, DMV, Secretary of State, etc.) for tax and titling purposes. This data typically includes vehicle attributes (for example, model year, make, model, mileage, etc.) and sales transaction prices for tax purposes.

Financial institution 182 may be any entity such as a bank, savings and loan, credit union, etc. that provides any type of financial services to a participant involved in the purchase of a vehicle. For example, when a buyer purchases a vehicle they may utilize a loan from a financial institution, where the loan process usually requires two steps: applying for the loan and contracting the loan. These two steps may utilize vehicle and consumer information in order for the financial institution to properly assess and understand the risk profile of the loan. Typically, both the loan application and loan agreement include proposed and actual sales prices of the vehicle.

Sales data companies 160 may include any entities that collect any type of vehicle sales data. For example, syndicated sales data companies aggregate new and used sales transaction data from the DMS 132 systems of particular dealers 130. These companies may have formal agreements with dealers 130 that enable them to retrieve data from the dealer 130 in order to syndicate the collected data for the purposes of internal analysis or external purchase of the data by other data companies, dealers, and OEMs.

Manufacturers 150 are those entities which actually build the vehicles sold by dealers 130. In order to guide the pricing of their vehicles, the manufacturers 150 may provide an Invoice price and a Manufacturer's Suggested Retail Price (MSRP) for both vehicles and options for those vehicles, to be used as general guidelines for the dealer's cost and price. These fixed prices are set by the manufacturer and may vary slightly by geographic region.

External information sources 184 may comprise any number of other various sources, online or otherwise, which may provide other types of desired data, for example data regarding vehicles, pricing, demographics, economic conditions, markets, locale(s), consumers, etc.

It should be noted here that not all of the various entities depicted in topology 100 are necessary, or even desired, in embodiments of the present invention, and that certain of the functionality described with respect to the entities depicted in topology 100 may be combined into a single entity or eliminated altogether. Additionally, in some embodiments other data sources not shown in topology 100 may be utilized. Topology 100 is therefore exemplary only and should in no way be taken as imposing any limitations on embodiments of the present invention.

Before delving into the details of various embodiments of the present invention it may be helpful to give a general overview of an embodiment of the present invention with respect to the above described embodiment of a topology, again using the example commodity of vehicles. At certain intervals then, vehicle data system 120 may obtain by gathering (for example, using interface 192 to receive or request) data from one or more of inventory companies 140, manufacturers 150, sales data companies 160, financial institutions 182, DMVs 180, external data sources 184 or dealers 130. This data may include sales or other historical transaction data for a variety of vehicle configurations, inventory data, registration data, finance data, vehicle data, etc. (the various types of data obtained will be discussed in more detail later). It should be noted that differing types of data may be obtained at different time intervals, where the time interval utilized in any particular embodiment for a certain type of data may be based, at least in part, on how often that data is updated at the source, how often new data of that type is generated, an agreement between the source of the data and the providers of the vehicle data system 120 or a wide variety of other factors. Once such data is obtained and stored in data store 122, it may be analyzed and otherwise processed to yield data sets corresponding to particular vehicle configurations (which may include, for example, vehicle make, model, power train, options, etc.) and geographical areas (national, regional, local, city, state, zip code, county, designated market area (DMA), or any other desired geographical area).

At some point then, a user at a computing device may access vehicle data system 120 using one or more interfaces 192 such as a set of web pages provided by vehicle data system 120. Using this interface 192 a user may specify a vehicle configuration by defining values for a certain set of vehicle attributes (make, model, trim, power train, options, etc.) or other relevant information such as a geographical location or incentives offered in conjunction with a vehicle of the specified configuration. Information associated with the specified vehicle configuration may then be presented to the user through interface 192. Data corresponding to the specified vehicle configuration can be determined using a data set associated with the specified vehicle configuration, where the determined data may include data such as adjusted transaction prices, mean price, dealer cost, standard deviation or a set of quantifiable price points or ranges (e.g. “average,” “good,” “great,” “overpriced,” etc. prices). The processing of the data obtained by the vehicle data system 120 and the determined data will be discussed in more detail later in the disclosure.

In particular, pricing data associated with the specified vehicle configuration may be determined and presented to the user in a visual manner. Specifically, in one embodiment, a price curve representing actual transaction data associated with the specified vehicle configuration (which may or may not have been adjusted) may be visually displayed to the user, along with visual references indicating one or more price ranges and one or more reference price points (e.g., invoice price, MSRP, dealer cost, market average, internet average, etc.). In some embodiments, these visual indicators may be displayed such that a user can easily determine what percentage of consumers paid a certain price or the distribution of prices within certain price ranges. Additionally, in some embodiments, the effect, or the application, of incentives may be presented in conjunction with the display. Again, embodiments of these types of interfaces will be discussed in more detail at a later point.

As the information provided by the vehicle data system 120 may prove invaluable for potential consumers, and may thus attract a large number of “visitors,” many opportunities to monetize the operation and use of vehicle data system 120 may present themselves. These monetization mechanisms include: advertising on the interfaces 192 encountered by a user of vehicle data system 120; providing the ability of dealers to reach potential consumers through the interfaces 192 or through another channel (including offering upfront pricing from dealers to users or a reverse auction); licensing and distribution of data (obtained or determined); selling analytics toolsets which may utilize data of vehicle data system 120; or any number of other monetization opportunities, embodiments of which will be elaborated on below.

Turning now to FIGS. 2A and 2B, one particular embodiment of a method for the operation of a vehicle data system is depicted. Referring first to the embodiment of FIG. 2A, at step 210 data can be obtained from one or more of the data sources (inventory companies 140, manufacturers 150, sales data companies 160, financial institutions 182, DMVs 180, external data sources 184, dealers 130, etc.) coupled to the vehicle data system 120 and the obtained data can be stored in the associated data store 122. In particular, obtaining data may comprise gathering the data by requesting or receiving the data from a data source. It will be noted with respect to obtaining data from data sources that different data may be obtained from different data sources at different intervals, and that previously obtained data may be archived before new data of the same type is obtained and stored in data store 122.

In certain cases, some of the operators of these data sources may not desire to provide certain types of data, especially when such data includes personal information or certain vehicle information (VINs, license plate numbers, etc.). However, in order to correlate data corresponding to the same person, vehicle, etc. obtained from different data sources it may be desirable to have such information. To address this problem, operators of these data sources may be provided a particular hashing algorithm and key by operators of vehicle data system 120 such that sensitive information in data provided to vehicle data system 120 may be submitted and stored in data store 122 as a hashed value. Because each of the data sources utilizes the same hashing algorithm to hash certain provided data, identical data values will have identical hash values, facilitating matching or correlation between data obtained from different (or the same) data source(s). Thus, the data source operators' concerns can be addressed while simultaneously avoiding adversely impacting the operation of vehicle data system 120.
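The matching property described above can be illustrated with a minimal sketch. The disclosure does not name the hashing algorithm; HMAC-SHA256, the helper name hash_sensitive, the normalization step, the example key and the sample records are all assumptions made only for illustration.

```python
import hashlib
import hmac


def hash_sensitive(value: str, key: bytes) -> str:
    """Keyed hash of a sensitive field (hypothetical helper).

    HMAC-SHA256 is an assumed choice; the text only requires that every
    data source use the same algorithm and key.
    """
    # Normalize so trivially different spellings of a value hash identically.
    normalized = value.strip().upper().encode("utf-8")
    return hmac.new(key, normalized, hashlib.sha256).hexdigest()


shared_key = b"example-shared-key"  # assumption: distributed by the system operator

# Two sources submit the same vehicle; only the hashed VIN is stored.
dms_record = {"vin_hash": hash_sensitive("1HGCM82633A004352", shared_key), "price": 24150}
dmv_record = {"vin_hash": hash_sensitive(" 1hgcm82633a004352", shared_key), "price": 24150}

# Identical inputs yield identical hashes, so the records can be correlated
# without either source disclosing the VIN itself.
same_vehicle = dms_record["vin_hash"] == dmv_record["vin_hash"]
```

A keyed hash is used in this sketch, rather than a plain digest, because the text mentions a key supplied by the operators of the vehicle data system.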

Once data is obtained and stored in data store 122, the obtained data may be cleansed at step 220. The cleansing of this data may include evaluation of the data to determine if it conforms to known values, falls within certain ranges or is duplicative. When such data is found, it may be removed from the data store 122, the values which are incorrect or fall outside a threshold may be replaced with one or more values (which may be known specifically or be default values), or some other action entirely may be taken.

This cleansed data may then be used to form and optimize sample sets of data at step 230. This formation and optimization process may include grouping data into data sets according to geography (for example, national, regional, local, state, county, zip code, DMA, some other definition of a geographic area such as within 500 miles of a location, etc.) and optimizing these geographic data sets for a particular vehicle configuration. This optimization process may result in one or more data sets corresponding to a particular vehicle or group or type of vehicles, a set of attributes of a vehicle and an associated geography.

Using the data sets resulting from the optimization process, a set of models may be generated at step 240. These models may include a set of dealer cost models corresponding to one or more of the data sets resulting from the optimization process discussed above. An average price ratio (for example, price paid/dealer cost) model for the data set may also be generated using the obtained data. It will be noted that these models may be updated at certain intervals, where the interval at which each of the dealer cost models or average price ratio model is generated may, or may not, be related to the intervals at which data is obtained from the various data sources or the rate at which the other model(s) are generated.
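As a rough illustration of the average price ratio (price paid/dealer cost) mentioned above, the following sketch groups transactions by vehicle configuration and geography and averages the ratio within each group. The field names, the grouping key and the sample figures are assumptions; the disclosure does not specify how the model is actually fit.

```python
from collections import defaultdict
from statistics import mean


def average_price_ratios(transactions):
    """Mean of price / dealer cost per (configuration, region) data set."""
    ratios = defaultdict(list)
    for t in transactions:
        if t["dealer_cost"] > 0:  # skip records that cannot yield a ratio
            key = (t["config"], t["region"])
            ratios[key].append(t["price"] / t["dealer_cost"])
    return {key: mean(values) for key, values in ratios.items()}


# Hypothetical sample data, for illustration only.
sample = [
    {"config": "2010 Sedan LX", "region": "TX", "price": 22000.0, "dealer_cost": 20000.0},
    {"config": "2010 Sedan LX", "region": "TX", "price": 21000.0, "dealer_cost": 20000.0},
    {"config": "2010 Sedan LX", "region": "CA", "price": 23000.0, "dealer_cost": 20000.0},
]
model = average_price_ratios(sample)
```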

Moving on to the portion of the embodiment depicted in FIG. 2B, at step 250 the vehicle data system may receive a specific vehicle configuration through a provided interface. In one embodiment, for example, a user at a web page provided by vehicle data system 120 may select a particular vehicle configuration using one or more menus or may navigate through a set of web pages to provide the specific vehicle configuration. This specified vehicle configuration may comprise values for a set of attributes of a desired vehicle such as a make, model, trim level, one or more options, etc. The user may also specify a geographic locale where he is located or where he intends to purchase a vehicle of the provided specification.

Other information which a user may provide includes incentive data pertaining to the specified vehicle configuration. In one embodiment, when a user specifies a particular vehicle configuration the vehicle data system 120 will present the user with a set of incentives associated with the specified vehicle configuration if any are available. The user may select zero or more of these incentives to apply.

Pricing data associated with the specified vehicle configuration may then be determined by the vehicle data system 120 at step 260. Determining this data may include determining adjusted transaction prices, mean, median, and probability distributions for pricing data associated with the specified vehicle configuration within certain geographical areas (including, for example, the geographic locale specified); calculating a set of quantifiable price points or ranges (e.g. “average,” “good,” “great,” “overpriced,” etc. prices or price ranges); determining historical price trends or pricing forecasts; or determining any other type of desired data. In one embodiment, the data associated with the specified vehicle configuration may be determined using the price ratio model and historical transaction data associated with the specified vehicle configuration as will be discussed.

An interface for presentation of the determined pricing data associated with the specified vehicle configuration may then be generated at step 270. These interfaces may comprise a visual presentation of such data using, for example, bar charts, histograms, Gaussian curves with indicators of certain price points, graphs with trend lines indicating historical trends or price forecasts, or any other desired format for the visual presentation of data. In particular, in one embodiment, the determined data may be fit and displayed as a Gaussian curve representing actual transaction data associated with the specified vehicle configuration, along with visual indicators on, or under, the curve which indicate determined price points or ranges, such as one or more quantifiable prices or one or more reference price points (for example, invoice price, MSRP, dealer cost, market average, internet average, etc.). The user may also be presented with data pertaining to any incentive data utilized to determine the pricing data. Thus, using such an interface a user can easily determine certain price points, what percentage of consumers paid a certain price or the distribution of prices within certain ranges. It should be noted here that though the interfaces elaborated on with respect to the presentation of data to a user in conjunction with certain embodiments are visual interfaces, other interfaces which employ audio, tactile, some combination, or other methods entirely may be used in other embodiments to present such data.
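One way to read the Gaussian presentation described above is sketched below: a normal distribution is fit to transaction prices, and its cumulative distribution gives the share of consumers who paid at or below a given price. The function names and sample prices are assumptions, and the standard-library NormalDist is used here only as a convenient stand-in for whatever fitting procedure an embodiment employs.

```python
from statistics import NormalDist


def fit_price_curve(prices):
    """Fit a Gaussian to transaction prices (sample mean and sample stdev)."""
    return NormalDist.from_samples(prices)


def share_paying_at_most(curve, price):
    """Estimated fraction of buyers who paid `price` or less."""
    return curve.cdf(price)


# Hypothetical adjusted transaction prices for one configuration and locale.
curve = fit_price_curve([24000, 25000, 26000])
at_average = share_paying_at_most(curve, 25000)
at_one_sigma = share_paying_at_most(curve, 26000)
```

Reference price points such as invoice price or MSRP could be located on the same curve by evaluating the same function at those prices.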

The interfaces may be distributed through a variety of channels at step 280. The channels may comprise a consumer facing network based application (for example, a set of web pages provided by vehicle data system 120 which a consumer may access over a network at a computing device such as a computer or mobile phone and which are tailored to the desires of, or use by, consumers); a dealer facing network based application (a set of web pages provided by the vehicle data system 120 which are tailored to the desires of, or use by, dealers); text or multimedia messaging services; widgets for use in web sites or in other application settings, such as mobile phone applications; voice applications accessible through a phone; or almost any other channel desired. It should be noted that the channels described here, and elsewhere within this disclosure, in conjunction with the distribution of data may also be used to receive data (for example, a user specified vehicle configuration or the like), and that the same or some combination of different channels may be used both to receive data and distribute data.

The distribution of this data through these various channels may be monetized at step 290. This monetization may be achieved in a number of ways, including by selling display or contextual ads, contextual links, sponsorships, etc. in conjunction with one or more interfaces (such as web pages, etc.) provided by vehicle data system 120; providing the ability of users to purchase vehicles from dealers through one or more provided interfaces and charging dealers, users or both to utilize this service; providing a reverse auction system whereby dealers can present prices for particular vehicles to the user and the dealers are charged for this ability; charging dealers or users for the licensing or provisioning of obtained or determined data to the dealers or users; charging for access to tools for manufacturers, dealers, financial institutions, leasing groups, and other end users which may include custom analytics or data; or almost any other way desirable to monetize the applications, capabilities or data associated with vehicle data system 120.

As may be apparent from a review of the above discussion, embodiments of vehicle data system 120 may entail a number of processes occurring substantially simultaneously or at different intervals, and many computing devices 110 may desire to access vehicle data system 120 at any given point. Accordingly, in some embodiments, vehicle data system 120 may be implemented utilizing an architecture or infrastructure that facilitates cost reduction, performance, fault tolerance, efficiency and scalability of the vehicle data system 120.

One embodiment of such an architecture is depicted in FIG. 3. Specifically, one embodiment of vehicle data system 120 may be operable to provide a network based interface including a set of web pages accessible over the network, including web pages where a user can specify a desired vehicle configuration and receive pricing data corresponding to the specified vehicle configuration. Such a vehicle data system 120 may be implemented utilizing a content delivery network (CDN) comprising data processing and analysis servers 310, services servers 320, origin servers 330 and server farms 340 distributed across one or more networks, where servers in each of data processing and analysis servers 310, services servers 320, origin servers 330 and server farms 340 may be deployed in multiple locations using multiple network backbones or networks where the servers may be load balanced as is known in the art.

Data processing and analysis servers 310 may interact with one or more data sources 350 (examples of which are discussed above) to obtain data from these data sources 350 at certain time intervals (for example, daily, weekly, hourly, at some ad-hoc variable interval, etc.) and process this obtained data as discussed both above and in more detail later herein. This processing includes, for example, the cleansing of the obtained data, determining and optimizing sample sets, the generation of models, etc.

Origin servers 330 may populate a web cache at each of server farms 340 with content for the provisioning of the web pages of the interface to users at computing devices 360 (examples of which are discussed above). Server farms 340 may provide the set of web pages to users at computing devices 360 using web caches at each server farm 340. More specifically, users at computing devices 360 connect over the network to a particular server farm 340 such that the user can interact with the web pages to submit and receive data through the provided web pages. In association with a user's use of these web pages, user requests for content may be algorithmically directed to a particular server farm 340. For example, when optimizing for performance, locations for serving content to the user may be selected by choosing locations that are the fewest hops or the fewest number of network seconds away from the requesting client, or that have the highest availability in terms of server performance (both current and historical), so as to optimize delivery across the network.

Certain of the web pages or other interfaces provided by vehicle data system 120 may allow a user to request services, interfaces or data which cannot be provided by server farms 340, such as requests for data which is not stored in the web cache of server farms 340 or analytics not implemented in server farms 340. User requests which cannot be serviced by server farm 340 may be routed to one of services servers 320. These requests may include requests for complex services which may be implemented by services servers 320, in some cases utilizing the data obtained or determined using data processing and analysis servers 310.

It may now be useful to go over, in more detail, embodiments of methods for the operation of a vehicle data system which may be configured according to embodiments of the above described architecture or another architecture altogether. FIGS. 4A and 4B depict one embodiment of just such a method. Referring first to FIG. 4A, at step 410 data can be obtained from one or more of the data sources coupled to the vehicle data system and the obtained data stored in a data store. The data obtained from these various data sources may be aggregated from the multiple sources and normalized. The various data sources and the respective data obtained from these data sources may include some combination of DMS data 411, inventory data 412, registration or other government (DMV, Sec. of State, etc.) data 413, finance data 414, syndicated sales data 415, incentive data 417, upfront pricing data 418, OEM pricing data 419 or economic data 409.

DMS data 411 may be obtained from a DMS at a dealer. The DMS is a system used by vehicle dealers to manage sales, finance, parts, service, inventory or back office administration needs. Thus, data which tracks all sales transactions for both new and used cars sold at retail or wholesale by the dealer may be stored in the DMS and obtained by the vehicle data system. In particular, this DMS data 411 may comprise data on sales transactions which have been completed by the dealer (referred to as historical sales transactions), including identification of a vehicle make, model, trim, etc. and an associated transaction price at which the vehicle was purchased by a consumer. In some cases, sales transaction data may also have a corresponding dealer cost for that vehicle. As most DMS are ASP-based, in some embodiments the sales transaction or other DMS data 411 can be obtained directly from the DMS or DMS provider utilizing a “key” (for example, an ID and Password with set permissions) that enables the vehicle data system or DMS polling companies to retrieve the DMS data 411, which, in one embodiment, may be obtained on a daily or weekly basis.

Inventory data 412 may be detailed data pertaining to vehicles currently within a dealer's inventory, or which will be in the dealer's inventory at some point in the future. Inventory data 412 can be obtained from a DMS, inventory polling companies, inventory management companies or listing aggregators. Inventory polling companies are typically commissioned by a dealer to pull data from the dealer's DMS and format the data for use on web sites and by other systems. Inventory management companies manually upload inventory information (for example, photos, descriptions, specifications, etc. pertaining to a dealer's inventory) to desired locations on behalf of the dealer. Listing aggregators may get data by “scraping” or “spidering” web sites that display a dealer's inventory (for example, photos, descriptions, specifications, etc. pertaining to a dealer's inventory) or receive direct feeds from listing websites (for example, FordVehicles.com).

Registration or other government data 413 may also be obtained at step 410. When a buyer purchases a vehicle it must be registered with the state (for example, DMV, Secretary of State, etc.) for tax, titling or inspection purposes. This registration data 413 may include vehicle description (for example, model year, make, model, mileage, etc.) and a sales transaction price which may be used for tax purposes.

Finance and agreement data 414 may also be obtained. When a buyer purchases a vehicle using a loan or lease product from a financial institution, the loan or lease process usually requires two steps: applying for the loan or lease and contracting the loan or lease. These two steps utilize vehicle and consumer information in order for the financial institution to properly assess and understand the risk profile of the loan or lease. This finance application or agreement data 414 may also be obtained at step 410. In many cases, both the application and agreement include proposed and actual sales prices of the vehicle.

Syndicated sales data 415 can also be obtained by the vehicle data system at step 410. Syndicated sales data companies aggregate new and used sales transaction data from the DMS of dealers with whom they are partners or have a contract. These syndicated sales data companies may have formal agreements with dealers that enable them to retrieve transaction data in order to syndicate the transaction data for the purposes of analysis or purchase by other data companies, dealers or OEMs.

Incentive data 416 can also be obtained by the vehicle data system. OEMs use manufacturer-to-dealer and manufacturer-to-consumer incentives or rebates in order to lower the transaction price of vehicles or allocate additional financial support to the dealer to help stimulate sales. As these rebates are often large (2%-20% of the vehicle price) they can have a dramatic effect on vehicle pricing. These incentives can be distributed to consumers or dealers on a national or regional basis. As incentives may be vehicle or region specific, their interaction with pricing can be complex, and incentive data can be an important tool for understanding transaction pricing. This incentive data can be obtained from OEMs, dealers or another source altogether such that it can be used by the vehicle data system to determine accurate transaction, or other, prices for specific vehicles.

As dealers may have the opportunity to pre-determine pricing on their vehicles it may also be useful to obtain this upfront pricing data 418 at step 410. Companies like Zag.com Inc. enable dealers to input pre-determined, or upfront, pricing to consumers. This upfront price is typically the “no haggle” (price with no negotiation) price. Many dealers also present their upfront price on their websites and even build their entire business model around the notion of “no negotiation” pricing. These values may be used for a variety of reasons, including providing a check on the transaction prices associated with obtained historical transaction data.

Additionally, OEM pricing data 419 can be obtained at step 410. This OEM pricing data may provide important reference points for the transaction price relative to vehicle and dealer costs. OEMs usually set two important numbers in the context of vehicle sales, invoice price and MSRP (also referred to as sticker price), to be used as general guidelines for the dealer's cost and price. These are fixed prices set by the manufacturer and may vary slightly by geographic region. The invoice price is what the manufacturer charges the dealer for the vehicle. However, this invoice price does not include discounts, incentives, or holdbacks which usually make the dealer's actual cost lower than the invoice price. According to the American Automobile Association (AAA), the MSRP is, on average, a 13.5% difference from what the dealer actually paid for the vehicle. Therefore, the MSRP is almost always open for negotiation. An OEM may also define what is known as a dealer holdback, or just a holdback. Holdback is a payment from the manufacturer to the dealer to assist with the dealership's financing of the vehicle. Holdback is typically a percentage (2 to 3%) of the MSRP.

Although the MSRP may not equate to an actual transaction price, an invoice price can be used to determine an estimate of a dealer's actual cost as this dealer cost is contingent on the invoice. The actual dealer cost can be defined as invoice price less any applicable manufacturer-to-dealer incentives or holdbacks. The vehicle data system may therefore utilize the invoice price of a vehicle associated with a historical transaction to determine an estimate of the dealer's actual cost which will enable it to determine “front-end” gross margins (which can be defined as the transaction price less dealer cost and may not include any margin obtained on the “back end” including financing, insurance, warranties, accessories and other ancillary products).
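The two definitions above (dealer cost as invoice less applicable manufacturer-to-dealer incentives and holdback, and front-end gross as transaction price less dealer cost) can be written out as a short sketch. The function names and dollar amounts are hypothetical; the 2% holdback rate is taken from the 2 to 3% range stated earlier.

```python
def holdback_from_msrp(msrp, rate):
    """Holdback expressed as a percentage of MSRP (typically 2 to 3%)."""
    return msrp * rate


def estimate_dealer_cost(invoice, holdback=0.0, dealer_incentives=0.0):
    """Invoice price less applicable holdback and manufacturer-to-dealer incentives."""
    return invoice - holdback - dealer_incentives


def front_end_gross(transaction_price, dealer_cost):
    """Transaction price less dealer cost; excludes any back-end margin."""
    return transaction_price - dealer_cost


# Hypothetical vehicle: MSRP $25,000, invoice $23,000, $1,000 dealer incentive.
holdback = holdback_from_msrp(25000.0, 0.02)             # 2% of MSRP
cost = estimate_dealer_cost(23000.0, holdback, 1000.0)   # estimated actual cost
gross = front_end_gross(23500.0, cost)                   # sold for $23,500
```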

Data may also be obtained from a wide variety of other data sources, including economic data 409 related to the current, past or future state of almost any facet of the economy including gas prices, demographic data such as household income, markets, locale(s), consumers, or almost any other type of data desired. The economic data may be specific to, or associated with, a certain geographic area. Additionally, this economic data may comprise an internet index, which may be determined from the average price for a vehicle as reported by certain Internet research sites. Although these Internet research sites are typically consumer focused, they sell advertising and leads to the automotive dealerships; therefore their paying customers are dealerships and the prices on these sites tend to represent the higher end of the scale, favoring dealerships.

Once the desired data is obtained, the obtained data may be cleansed at step 420. In particular, the data obtained may not be useful if it is inaccurate, duplicative or does not conform to certain parameters. Therefore, the vehicle data system may cleanse obtained data to maintain the overall quality and accuracy of the data presented to end users. This cleansing process may entail the removal or alteration of certain data based on almost any criteria desired, where these criteria may, in turn, depend on other obtained or determined data or the evaluation of the data to determine if it conforms with known values, falls within certain ranges or is duplicative. When such data is found it may be removed from the data store of the vehicle data system, the values which are incorrect or fall outside a threshold may be replaced with one or more values (which may be known specifically or be default values), or some other action entirely may be taken.

In one embodiment, during this cleansing process a VIN decode 428 may take place, where a VIN associated with data (for example, a historical transaction) may be decoded. Specifically, every vehicle sold must carry a Vehicle Identification Number (VIN), or serial number, to distinguish itself from other vehicles. The VIN consists of 17 characters that contain codes for the manufacturer, year, vehicle attributes, plant, and a unique identity. Vehicle data system may use an external service to determine a vehicle's attributes (for example, make, model year, model, powertrain, trim, etc.) based on each vehicle's VIN and associate the determined vehicle information with the sales transaction from which the VIN was obtained. Note that in some cases, this data may be provided with historical transaction data, and the decode may not need to occur with respect to one or more of the historical transactions.
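For orientation, the 17 characters can be split into their conventional fields before being handed to a decoding service. The sketch below assumes the common North American layout (world manufacturer identifier in positions 1-3, descriptor in 4-8, check digit in 9, model year code in 10, plant code in 11, serial in 12-17); it only splits the string and does not map codes to a make, model or trim, which the disclosure leaves to an external service. The function name and sample VIN are illustrative.

```python
def split_vin(vin):
    """Split a 17-character VIN into positional fields (no code lookup)."""
    vin = vin.strip().upper()
    # VINs are alphanumeric and never use the letters I, O or Q.
    if len(vin) != 17 or not vin.isalnum() or any(c in "IOQ" for c in vin):
        raise ValueError("not a well-formed 17-character VIN")
    return {
        "wmi": vin[0:3],             # manufacturer identifier
        "descriptor": vin[3:8],      # vehicle attributes
        "check_digit": vin[8],
        "model_year_code": vin[9],
        "plant_code": vin[10],
        "serial": vin[11:17],        # unique identity
    }


fields = split_vin("1HGCM82633A004352")
```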

Additionally, inaccurate or incomplete data may be removed 422. In one embodiment, the vehicle data system may remove any historical transaction data that does not include one or more key fields that may be utilized in the determination of one or more values associated with that transaction (for example, front end gross, vehicle make, model or trim, etc.). Other high-level quality checks may be performed to remove inaccurate (including poor quality) historical transaction data. Specifically, in one embodiment cost information (for example, dealer cost) associated with a historical transaction may be evaluated to determine if it is congruent with other known, or determined, cost values associated with the make, model or trim of the vehicle to which the historical transaction data pertains. If there is an inconsistency (for example, the cost information deviates from the known or determined values by a certain amount) the cost information may be replaced with a known or determined value or, alternatively, the historical transaction data pertaining to that transaction may be removed from the data store.

In one embodiment, for each historical transaction obtained the following actions may be performed: verifying that the transaction price falls within a certain range of an estimated vehicle MSRP corresponding to the historical transaction (e.g. 60% to 140% of MSRP of the base vehicle); verifying that the dealer cost for the transaction falls within a range of an estimated dealer cost (e.g. 70% to 130% of invoice−holdback of the base vehicle); verifying that a total gross (front end+back end gross) for the historical transaction is within an acceptable range (e.g. −20% to 50% of the vehicle base MSRP); and verifying that the type of sale (new/used) aligns with the number of miles on the vehicle (for example, a vehicle with more than 500 miles should not be considered new).
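The four checks listed above translate directly into range comparisons. The following sketch uses the example thresholds from the text; the record field names and the sample transactions are assumptions.

```python
def passes_range_checks(t):
    """Apply the example per-transaction sanity checks to one record."""
    msrp = t["base_msrp"]
    est_cost = t["base_invoice"] - t["holdback"]      # invoice less holdback
    total_gross = t["front_end_gross"] + t["back_end_gross"]
    checks = [
        0.60 * msrp <= t["price"] <= 1.40 * msrp,               # price vs. MSRP
        0.70 * est_cost <= t["dealer_cost"] <= 1.30 * est_cost,  # cost vs. estimate
        -0.20 * msrp <= total_gross <= 0.50 * msrp,             # total gross
        not (t["sale_type"] == "new" and t["miles"] > 500),     # new vs. mileage
    ]
    return all(checks)


# Hypothetical records, for illustration only.
plausible = {
    "base_msrp": 25000.0, "base_invoice": 23000.0, "holdback": 500.0,
    "price": 23500.0, "dealer_cost": 22000.0,
    "front_end_gross": 1500.0, "back_end_gross": 800.0,
    "sale_type": "new", "miles": 12,
}
mislabeled = dict(plausible, miles=18000)  # marked new but heavily driven
```

A record that fails any check could then be removed or corrected, as described for the cleansing step.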

In addition, the new car margin (front-end gross) may be adjusted up or down for transactions that have a high or low back-end gross. This adjustment may be a combination of the magnitude of the back-end gross and a factor based on historical analysis (for example, where a dealership has a sales transaction comprising a trade amount of $5000 and an actual trade value of $7000, and thus made $2000 on the vehicle trade, the front-end gross for this sales transaction would be increased by this $2000 since this dealer would have accepted a lower transaction price). The front end gross may also be adjusted based on rebates or incentives from the manufacturer that go directly to the dealers, as only a percentage of this rebate gets passed on to the customer. The exact factor to utilize in a given instance may be determined based on historical analysis and current market conditions. For example, if a manufacturer is offering $5000 in marketing support to a dealer, the dealer is not required to pass this money on to the end customer; however, a percentage of this money (e.g. 50%-80%) is usually given to the customer in the form of a lower transaction price. Furthermore, the front-end gross may be adjusted according to a number of minor factors that change the front-end gross based on the accounting practices of an individual dealership. For example, some dealers adjust the front-end gross to affect the salesperson's commission; these adjustments are removed when possible.
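The trade-in example above can be worked through in a few lines. The sketch below covers only that example; the function name, the starting front-end gross of $1,500 and the default factor of 1.0 are assumptions, since the text leaves the actual factor to historical analysis and market conditions.

```python
def adjust_front_end_gross(front_end_gross, trade_allowance, trade_actual_value,
                           factor=1.0):
    """Add the dealer's profit on the trade-in, scaled by `factor`, to the front-end gross."""
    trade_profit = trade_actual_value - trade_allowance
    return front_end_gross + factor * trade_profit


# Trade taken in at $5,000 but actually worth $7,000: the dealer made $2,000
# on the trade, so the front-end gross is increased by that amount.
adjusted = adjust_front_end_gross(1500.0, trade_allowance=5000.0,
                                  trade_actual_value=7000.0)
```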

Duplicate data may also be removed 424. As there may be many sources for historical transaction data, in many cases duplicative historical transaction data may be obtained. As such duplicative data can skew the results of the output of the vehicle data system, it may be desired to remove such duplicate data. In cases where uniquely identifiable attributes such as the VIN are available, this process is straightforward (for example, VINs associated with historical transactions may be matched to locate duplicates). In cases where the transaction data does not have a unique attribute (in other words, an attribute which could pertain to only one vehicle, such as a VIN), a combination of available attributes may be used to determine if a duplicate exists. For example, a combination of sales date, transaction type, transaction state, whether there was a trade-in on the transaction, the vehicle transaction price or the reported gross may all be used to identify duplicates. In either case, once a duplicate is identified, the transaction data comprising the most attributes may be kept while the duplicates are discarded. Alternatively, data from the duplicate historical transactions may be combined in some manner into a single historical transaction.
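One possible sketch of this duplicate removal is given below. It assumes each transaction is a dictionary; the key names and the `FALLBACK_FIELDS` tuple are hypothetical. Records are matched on VIN where one is present and on the combination of attributes otherwise, and the record with the most populated attributes is kept.

```python
# Hypothetical attribute names used when no VIN is available.
FALLBACK_FIELDS = ("sale_date", "transaction_type", "state",
                   "had_trade_in", "price", "gross")


def dedupe(transactions):
    """Keep, per vehicle, the record with the most populated attributes."""
    best = {}
    for t in transactions:
        if t.get("vin"):
            key = ("vin", t["vin"])
        else:
            key = ("combo",) + tuple(t.get(f) for f in FALLBACK_FIELDS)
        filled = sum(v is not None for v in t.values())
        # Replace the stored record only if this one is strictly richer.
        if key not in best or filled > best[key][0]:
            best[key] = (filled, t)
    return [t for _, t in best.values()]
```

Note that this sketch does not match a record carrying a VIN against a record for the same vehicle lacking one, nor does it implement the alternative of merging duplicates into a single transaction.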

Outlier data can also be removed 426. Outlier data is defined as data that does not appear to properly represent a likely transaction. In one embodiment, historical transaction data pertaining to transactions with a high negative margin (dealer loses too much money) or a high positive margin (dealer appears to earn too much money) may be removed. Removing outlier data may, in one embodiment, be accomplished by removing outlier data with respect to national, regional, local or other geographic groupings of the data, as removing outlier data at different geographic levels may remove different sets of transaction data. In addition, relative or absolute trimming may be used such that a particular percentage of the transactions beyond a particular standard deviation may be removed off of the top and bottom of the historical transactions.
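As a minimal sketch of one such trimming scheme, margins lying more than a chosen number of standard deviations from the mean of a grouping may be dropped. The function name and the default cutoff of two standard deviations are assumptions for illustration, and the function would be applied separately to each geographic grouping.

```python
import statistics


def trim_outliers(margins, max_sigma=2.0):
    """Drop margins more than `max_sigma` standard deviations from the mean."""
    mean = statistics.mean(margins)
    sd = statistics.pstdev(margins)
    if sd == 0:
        return list(margins)  # all margins identical, nothing to trim
    return [m for m in margins if abs(m - mean) <= max_sigma * sd]
```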

After step 420, cleansed data may be stored in a data store associated with the vehicle data system, where the cleansed data includes a set of historical transactions, each historical transaction associated with at least a set of vehicle attributes (for example, make, model, engine type, trim, etc.) and a transaction price or front end gross.

At step 430, then, the cleansed data may be grouped according to geography into data sets using a binning process and these geographic data sets optimized for a particular vehicle configuration. This optimization process may result in one or more data sets corresponding to a specific vehicle or group or type of vehicles, a trim level or set of attributes of a vehicle, and an associated geography.

In one embodiment, permutations of attributes may be iterated over to determine the attribute that has the most significant impact on margin. The iterations may continue until a stack ranked list of attributes, from most to least significant impact on the margin, is determined. Then, when grouping transactions for a particular location and vehicle, this ranked list can be utilized to produce a data set that is both significant and relevant by ignoring or giving less weight to attributes that will impact margin the least.

In order to make vehicle pricing data more accurate, it may be important to maintain timeliness or relevancy of the data presented or utilized. In one embodiment, then, the total number of recent (within a desired time period) and relevant transactions may be optimized with respect to the cleansed data. Relevant data corresponding to a particular geographic region and a particular vehicle may be binned to optimize the quantity of data available for each vehicle within each geographic region. This quantity of data may be optimized to yield bins of historical transaction data corresponding to a trim level (a certain set of attributes corresponding to the vehicle) of a particular model car and an associated geography using geographic assignment of data 432 and attribute categorization and mapping to trim 436.

During geographic assignment of data 432, data is labeled with one or more of national (all data), regional, state, or DMA definition. Attribute categorization and trim mapping 436 may also occur. Vehicle data can be sorted at the trim level (for example, using data regarding the vehicle obtained from a VIN decode or another source). This enables the accurate presentation of relevant pricing based on similar vehicles within a given time frame (optimizing recency). In some cases, a determination may be made that there is not a threshold quantity of data for a specific vehicle at a trim level to determine statistically significant data corresponding to a time period.

The vehicle data system analyzes vehicles at the model (e.g., Accord, Camry, F-150) level and runs analytics at an attribute level (for example, drivetrain, powertrain, body type, cab type, bed length, etc.) to determine if there is a consistency (correlation between attributes and trims) at the attribute level. Since there are a greater number of transactions when binning at an attribute level, attribute level binning may be used instead of trim level binning in these situations, thereby yielding a larger (relative to just trim level binning), but still relevant, data set to use for processing.

It will be noted with respect to these data sets that data within a particular data set may correspond to different makes, models, trim levels or attributes based upon a determined correlation between attributes. For example, a particular data set may have data corresponding to different makes or models if it is determined that there is a correlation between the two vehicles. Similarly, a particular data set may have data corresponding to different trims or having different attributes if a correlation exists between those different trim levels or attributes.

Using the historical transaction data, a set of models may be generated at step 440. This model generation process may comprise analyzing individual aspects of the historical transaction data in order to understand the margin for the seller based on the attributes, geography or time of sale. Understanding the margin of individual historical transactions allows these historical transactions to be grouped in statistically significant samples that are most relevant to an individual user based on their specifically configured vehicle and location.

Thus, the generated models may include a set of dealer cost models corresponding to each of the one or more data sets. From these dealer cost models and the historical transaction data associated with a data set, an average price ratio (for example, price paid/dealer cost) may be generated for a data set corresponding to a specific vehicle configuration using a price ratio model. These models will be discussed in more detail later in this disclosure.

Moving on to the portion of the embodiment depicted in FIG. 4B, at step 450 the vehicle data system may receive a specific vehicle configuration 452 through a provided interface. In one embodiment, for example, a user at a web page provided by the vehicle data system may select a particular vehicle configuration using one or more menus or may navigate through a set of web pages to provide the specific vehicle configuration 452. The user may also specify a geographic locale where he is located or where he intends to purchase a vehicle of the provided specification, or may select one or more consumer incentives which the user may desire to utilize in conjunction with a potential purchase. The provided interface may also be used to obtain other data including incentive data pertaining to the specified vehicle configuration. In one embodiment, when a user specifies a particular vehicle configuration an interface having a set of incentives associated with the specified vehicle configuration may be presented to a user if any such incentives are available. The user may select zero or more of these incentives to apply.

Data associated with the specified vehicle configuration provided by the user may then be determined by the vehicle data system at step 460. Specifically, in one embodiment, the vehicle data system may utilize one or more of models 462 (which may have been determined above with respect to step 440) associated with the vehicle configuration specified by the user (for example, associated with the make, model, trim level or one or more attributes of the specified vehicle) to process one or more data sets (for example, historical transaction data grouped by vehicle make, model, trim or attributes, various geographic areas, etc. associated with the specified vehicle configuration) in order to determine certain data corresponding to the user's specified vehicle.

The determined data corresponding to the specified vehicle configuration may include adjusted transaction prices and mean, median or probability distribution 464 associated with the specified vehicle at a national, regional or local geographical level. The data set corresponding to the specified vehicle may also be bucketed 466 (for example, percentile bucketed) in order to create histograms of data at national, regional, and local geographic levels. “Good,” “great,” or other prices and corresponding price ranges 468 may also be determined based on median, floor pricing (lowest transaction prices of the data set corresponding to the specified vehicle configuration) or algorithmically determined dividers (for example, between the “good,” “great,” or “overpriced” ranges). Each price or price range may be determined at national, regional, and local geographic levels. These prices or price ranges may be based on statistical information determined from the data set corresponding to the specified vehicle. For example, “good” and “great” prices or price ranges may be based on a number of standard deviations from a mean price associated with the sales transactions of the data set corresponding to the specified vehicle. For example, a “great” price range may be any price which is more than one half of a standard deviation below the mean price, while a “good” price range may be any price which is between the mean price and one half of a standard deviation below the mean. An “overpriced” range may be anything above the mean price or may be any price which is above the “good” price range.
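The standard-deviation example above may be sketched as follows. The function name and the use of the sample standard deviation are assumptions for this sketch; other dividers (median, floor pricing or algorithmically determined dividers) may equally be used.

```python
import statistics


def classify_price(price, prices):
    """Label a price against the example dividers described above."""
    mean = statistics.mean(prices)
    sd = statistics.stdev(prices)  # sample standard deviation (an assumption)
    if price < mean - 0.5 * sd:
        return "great"       # more than half a standard deviation below the mean
    if price <= mean:
        return "good"        # between (mean - sd/2) and the mean
    return "overpriced"      # anything above the mean
```

For transaction prices of 20000, 21000, 22000, 23000 and 24000 the mean is 22000 and half a sample standard deviation is roughly 790, so a price of 21000 would fall in the “great” range and 21500 in the “good” range.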

Historical average transaction prices and forecasts 469 corresponding to the specified vehicle configuration may also be determined at national, regional, and local geographic levels, where the forecasted pricing can be determined based on historical trends in the data set corresponding to the specified vehicle, as well as forecasted inventory, model year cycles, incentives or other variables.

Based on the determined data, an interface for the presentation of the determined data may then be generated at step 470. The interface generated may be determined in accordance with a user request received at the vehicle data system based on a user's interaction with other interfaces provided by the vehicle data system. In this manner a user may “navigate” through the interfaces provided by the vehicle data system to obtain desired data about a specified vehicle configuration presented in a desired manner.

These interfaces may serve to communicate the determined data in a variety of visual formats, including simplified normal distributions and pricing recommendations based on one or more data sets. In some embodiments, a price distribution for a particular data set associated with a specified vehicle configuration can be presented to users as a Gaussian curve 472. Using the normal distribution of transaction data in a given geographic area, the mean and the variance of pricing can be visually depicted to an end user. Visually, the Gaussian curve 472 may be shown to illustrate a normalized distribution of pricing (for example, a normalized distribution of transaction prices). On the curve's X-axis, the average price paid may be displayed along with the determined dealer cost, invoice or sticker price to show these prices' relevancy, and relation, to transaction prices. The determined “good,” “great,” “overpriced,” etc. price ranges are also visually displayed under the displayed curve to enable the user to identify these ranges. Incentive data utilized to determine the presented data may also be displayed to the user.

A histogram 474 may also be created for display to a user. The histogram is a graphical display of tabulated frequencies of the data set or determined data comprising a set of bars, where the height of each bar shows the percentage frequency, while the width of each bar represents a price range. On the histogram's X-axis, the average price paid, dealer cost, invoice, and sticker price may be displayed to show their relevancy, and relation, to transaction prices. The determined “good,” “great,” etc. prices or ranges may also be visually displayed with the histogram to enable the user to identify these ranges. Incentive data utilized to determine the presented data may also be displayed to the user.
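The tabulation behind such a histogram might be sketched as below, where each bar is a fixed-width price range and its height is the percentage of transactions falling in that range. The function name and the fixed bin width are assumptions for illustration; percentile bucketing could be used instead.

```python
def price_histogram(prices, bin_width):
    """Return (range_start, range_end, percent_of_transactions) per bar."""
    lowest = min(prices) // bin_width * bin_width  # align first bar to bin_width
    counts = {}
    for p in prices:
        start = lowest + (p - lowest) // bin_width * bin_width
        counts[start] = counts.get(start, 0) + 1
    total = len(prices)
    return [(start, start + bin_width, 100.0 * n / total)
            for start, n in sorted(counts.items())]
```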

Interfaces for determined historic trends or forecasts 478 may also be generated. For example, a historical trend chart may be a line chart enabling a user to view how average transaction prices have changed over a given period of time. The Y-axis represents the percentage change over given time periods while the X-axis represents given time periods. The user will also be able to view the average transaction price and average incentives over each given time period. In addition, the user will also be able to see how prices may change in the future based on algorithmic analysis. Other types of interfaces, such as bar charts illustrating specific price points (for example, average price paid, dealer cost, invoice, and sticker price) and ranges (for example, “good,” “great,” “overpriced,” etc.) in either a horizontal or vertical format, may also be utilized.

Using these types of visual interfaces may allow a user to intuitively understand a price distribution based on relevant information for their specific vehicle, which may, in turn, provide these users with strong factual data to understand how much variation there is in pricing and to negotiate, and understand what constitutes, a good deal. Additionally, by displaying the data sets associated with different vehicles in substantially the same format, users may be able to easily compare pricing data related to multiple vehicles or vehicle configurations.

The generated interfaces can be distributed through a variety of channels at step 480. It will be apparent that in many cases the channel through which an interface is distributed may be the channel through which a user initially interacted with the vehicle data system (for example, the channel through which the interface which allowed the user to specify a vehicle was distributed). However, it may also be possible to distribute these interfaces through different data channels as well. Thus, interfaces which present data sets and the results of the processing of these data sets may be accessed or displayed using multiple interfaces and will be distributed through multiple channels, enabling users to access desired data in multiple formats through multiple channels utilizing multiple types of devices. These distribution methods may include but are not limited to: consumer and dealer facing Internet-based applications 482. For example, the user may be able to access an address on the World Wide Web (for example, www.truecar.com) through a browser and enter specific vehicle and geographic information via its web tools. Data pertaining to the specific vehicle and geographic information may then be displayed to the user by presenting an interface at the user's browser. Data and online tools for the access or manipulation of such data may also be distributed to other automotive related websites and social networking tools throughout the web. These Internet-based applications may also include, for example, widgets which may be embedded in web sites provided by a third party to allow access to some, or all, of the functionality of the vehicle data system through the widget at the third party web site. Other Internet-based applications may include applications that are accessible through one or more social networking or media sites such as Facebook or Twitter, or that are accessible through one or more APIs or Web Services.

A user may also use messaging channels 484 to message a specific vehicle's VIN to the vehicle data system (for example, using a text, picture or voice message). The vehicle data system will respond with a message that includes the specific vehicle's pricing information (for example, a text, picture or voice message). Furthermore, in certain embodiments, the geographical locale used to determine the presented pricing information may be based on the area code of a number used by a user to submit a message or the location of a user's computing device. In certain cases, if no geographical locale can be determined, one may be asked for, or a national average may be presented.

In one embodiment, a user may be able to use phone based applications 486 to call the vehicle data system and use voice commands to provide a specific vehicle configuration. Based on information given, the vehicle data system will be able to verbally present pricing data to the user. Geography may be based on the area code of the user. If an area code cannot be determined, a user may be asked to verify their location by dictating their zip code or other information. It will be noted that such phone based applications 486 may be automated in nature, or may involve a live operator communicating directly with a user, where the live operator may be utilizing interfaces provided by the vehicle data system.

As the vehicle data system may provide access to different types of vehicle data in multiple formats through multiple channels, a large number of opportunities to monetize the vehicle data system may be presented to the operators of such a system. Thus, the vehicle data system may be monetized by its operators at step 490. More specifically, as the aggregated data sets, the results of processing done on the data sets or other data or advantages offered by the vehicle data system may be valuable, the operators of the vehicle data system may monetize its data or advantages through the various access and distribution channels, including utilizing a provided web site, distributed widgets, data, the results of data analysis, etc. For example, monetization may be achieved using automotive (vehicle, finance, insurance, etc.) related advertising 491 where the operators of the vehicle data system may sell display ads, contextual links, sponsorships, etc. to automotive related advertisers, including OEMs, regional marketing groups, dealers, finance companies or insurance providers.

Additionally, the vehicle data system may be monetized by facilitating prospect generation 493 based on upfront, pre-determined pricing. As users view the vehicle data system's interfaces they will also have the option to accept an upfront price (which may, for example, fall into the presented “good” or “great” price ranges). This price will enable a user to purchase a car without negotiating.

Operators of the vehicle data system may also monetize its operation by implementing reverse auctions 496 based on a dealer bidding system or the like. Dealers may have an opportunity through the vehicle data system to bid on presenting upfront pricing to the user. The lower the price a dealer bids, the higher priority they will be in the vehicle data system (for example, priority placement and first price presented to user), or some other prioritization scheme may be utilized. Users will be able to view bidders in a user-selected radius of the user's zip code or other geographic area and select a winning bidder. Embodiments of the implementation of such a reverse auction may be better understood with reference to U.S. patent application Ser. No. 12/556,109, filed Sep. 9, 2009, entitled “SYSTEM AND METHOD FOR SALES GENERATION IN CONJUNCTION WITH A VEHICLE DATA SYSTEM,” which is incorporated herein by reference in its entirety for all purposes.
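A minimal sketch of the lowest-bid-first prioritization described above follows. It assumes each bid is a dictionary with hypothetical keys `dealer`, `price` and `miles_from_user`, and that the distance to the user has already been computed elsewhere; other prioritization schemes may be substituted.

```python
def rank_bids(bids, max_miles):
    """Order dealer bids within the user-selected radius, lowest price first."""
    in_range = [b for b in bids if b["miles_from_user"] <= max_miles]
    return sorted(in_range, key=lambda b: b["price"])
```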

The operators of the vehicle data system may also license 492 data, the results of data analysis, or certain applications to application providers or other websites. In particular, the operators of the vehicle data system may license its data or applications for use on or with certain dealer tools, including inventory management tools, DMS, dealer website marketing companies, etc. The operators of the vehicle data system may also license access to its data and use of its tools on consumer facing websites (for example, Yahoo! Autos or the like).

Monetization of the vehicle data system may also be accomplished by enabling OEMs to buy contextual ads 495 on certain applications such as distributed widgets or the like. Users may see such ads as “other vehicles to consider” on the widget. The operators may also develop and sell access to online tools 497 for OEMs, finance companies, leasing companies, dealer groups, and other logical end users. These tools 497 will enable customers to run customized analytic reports which may not be available on the consumer facing website, such as statistical analysis toolsets or the like.

As the accuracy and the specificity of pricing information may be a significant advantage of embodiments of a vehicle data system presented herein, it may now be useful to present an overview of embodiments of the analytics which may be employed by a vehicle data system to illustrate how such pricing information is determined. Specifically, in one embodiment the data feeds from information sources may be leveraged to model variables and build multivariable regressions. More particularly, in one embodiment, using one set of historical data a set of dealer cost models may be determined as a formula based on invoice and MSRP data and, using a second set of historical data, a price ratio regression model may be determined, such that the vehicle data system may be configured to utilize these determined dealer cost models and the price ratio regression model in the calculation of pricing data corresponding to a user specified vehicle configuration.

When such a specified vehicle configuration is received, the historical transaction data associated with that specified vehicle configuration can be obtained. The transaction prices associated with the historical transaction data can be adjusted for incentives and the dealer cost model and price ratio model applied to determine desired data to present to the user. Specifically, in one embodiment, the user may provide such a specific vehicle configuration to the vehicle data system using an interface provided by the vehicle data system. The user may also select one or more currently available incentives to apply, where the currently available incentives are associated with the specified vehicle configuration. The specified vehicle configuration may define values for a set of attributes of a desired vehicle (for example, including transmission type, MSRP, invoice price, engine displacement, engine cylinders, # doors, body type, geographic location, incentives available, etc.) where the values for these attributes may be specified by the user or obtained by the vehicle data system using the values of attributes specified by the user. Based on the values of these attributes, the specified vehicle's bin may be identified. In one embodiment, a bin for a vehicle can be defined as the group of vehicles that have the same year, make, model and body type for which there is historical transaction data within a certain time period (for example, the past four weeks or some other time period).

Using the pricing information associated with the historical transactions in the bin corresponding to the specified vehicle, steady state prices may be determined by removing incentives from the prices in the historical transaction data. Once accurate transaction prices are determined, an average price and average cost for the specified vehicle may be computed using the historical transaction data associated with the bin of the specified vehicle. This bin-level determined average price and average cost may, in turn, be used along with the specified vehicle configuration to determine the average price ratio for the specified vehicle by plugging these values into the price ratio regression model and solving. Using this average price ratio and the prices paid (for example, adjusted for incentives) corresponding to the historical transaction data within the specified vehicle's bin, certain price ranges may be computed (for example, based on standard deviations from a price point (for example, the mean)). A Gaussian curve can then be fit parametrically to the actual price distributions corresponding to the historical transaction data of the bin and the result visually displayed to the user along with the computed price points.
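One simple way to fit a Gaussian curve parametrically is to estimate its mean and standard deviation directly from the incentive-adjusted prices in the bin. The sketch below uses the `NormalDist` class of the Python standard library (Python 3.8 or later); the function name is hypothetical, and this estimator is an assumption for illustration rather than a required fitting technique.

```python
from statistics import NormalDist


def fit_price_curve(adjusted_prices):
    """Estimate a normal curve (mean, standard deviation) from bin prices."""
    return NormalDist.from_samples(adjusted_prices)


curve = fit_price_curve([20000, 21000, 22000, 23000, 24000])
# curve.mean and curve.stdev parameterize the displayed curve, and
# curve.pdf(price) gives the curve height at any price along the X-axis.
half_sd_below_mean = curve.mean - 0.5 * curve.stdev  # an example price point
```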

Turning to FIG. 5, one embodiment for a method of determining accurate and relevant vehicle pricing information is depicted. At step 510 data may be obtained and cleansed as described above. This data includes a set of historical transaction data, where the historical transaction data may comprise data on a set of transactions which have occurred, where data for a particular historical transaction may comprise one or more prices associated with a vehicle actually sold to a consumer, including for example, an invoice price, a dealer cost, an MSRP, a price paid by the consumer (also known as a transaction price), etc. and values for a set of attributes corresponding to the vehicle sold (for example, make, model, transmission type, number of doors, power train, etc.). This historical transaction data may then be cleansed. This cleansing may entail an exclusion of certain historical transactions based on data values (for example, a transaction having a sale price of $5,021 may be deemed to be too low, and that sales transaction excluded) or the replacement of certain values associated with a historical transaction.

In certain embodiments, it may be desirable to be able to accurately determine dealer cost associated with historical transactions, as this dealer cost may be important in determining pricing data for a user, as will be discussed. While certain data sources may supply gross profit data in conjunction with provided historical transaction data, and this gross profit field may be used to determine dealer cost, this gross profit data is oftentimes unreliable. In one embodiment, then, when historical transaction data is cleansed, a dealer cost corresponding to each of a set of historical transactions may be determined using the dealer cost models associated with the vehicle data system, and the determined dealer cost may be associated with the corresponding historical transaction if the historical transaction does not have an associated dealer cost. Additionally, a dealer cost which is associated with a received historical transaction may be evaluated utilizing a determined dealer cost corresponding to that transaction such that the original dealer cost may be replaced with the determined dealer cost if the original dealer cost is determined to deviate from the determined dealer cost by some threshold, or is otherwise determined to be incorrect. Embodiments of methods for the determination of dealer cost for use in this type of cleansing will be described in more detail at a later point with reference to FIG. 19.
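The fill-or-replace rule described above might be sketched as follows. The function name and the 5% relative tolerance are assumptions chosen for illustration; the disclosure refers only to "some threshold".

```python
def cleanse_dealer_cost(reported_cost, modeled_cost, tolerance=0.05):
    """Return the dealer cost to store with a historical transaction.

    reported_cost: dealer cost received with the transaction, or None.
    modeled_cost:  dealer cost determined from the dealer cost model.
    tolerance:     assumed relative deviation beyond which the reported
                   value is treated as incorrect.
    """
    if reported_cost is None:
        return modeled_cost  # no dealer cost supplied: use the determined one
    if abs(reported_cost - modeled_cost) > tolerance * modeled_cost:
        return modeled_cost  # deviates too far: replace with the determined one
    return reported_cost
```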

Once the historical transaction data is obtained and cleansed, dealer cost models may be determined at step 520. More specifically, in one embodiment, a dealer cost model may be generated for each of a set of manufacturers by analyzing invoice data corresponding to that manufacturer (which may be received from dealers). In particular, the invoice data may be analyzed to determine the equation for deriving holdback in the dealer cost relationship (for example, where dealer cost=invoice−holdback).

The invoice data usually provided with each vehicle invoice contains the following: the holdback price, the invoice price, the freight charges and MSRP, among other data. Thus, taking each vehicle invoice as a separate observation and assuming that each equation for the dealer cost always takes a similar form, the various forms of the equation can be plotted to see which equation holds most consistently across observations. The equation which holds most consistently can be deemed to be the holdback equation (referred to as the dealer cost (DealerCost) model) for that manufacturer.
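As a numerical counterpart to plotting the candidate forms, the ratio of holdback to each candidate base can be computed per invoice and the form whose ratio varies least across invoices selected. The candidate bases, the dictionary keys and the use of the coefficient of variation as the consistency measure are all assumptions made for this sketch.

```python
import statistics

# Hypothetical candidate forms: holdback = rate * base(invoice record).
CANDIDATE_BASES = {
    "msrp": lambda inv: inv["msrp"],
    "msrp - freight": lambda inv: inv["msrp"] - inv["freight"],
    "invoice": lambda inv: inv["invoice"],
    "invoice - freight": lambda inv: inv["invoice"] - inv["freight"],
}


def most_consistent_form(invoices):
    """Return (form_name, rate) for the form holding most consistently."""
    best_name, best_rate, best_spread = None, None, None
    for name, base in CANDIDATE_BASES.items():
        ratios = [inv["holdback"] / base(inv) for inv in invoices]
        rate = statistics.mean(ratios)
        spread = statistics.pstdev(ratios) / rate  # coefficient of variation
        if best_spread is None or spread < best_spread:
            best_name, best_rate, best_spread = name, rate, spread
    return best_name, best_rate
```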

Turning briefly to FIG. 6, a graphic depiction of a plot of holdback equations applied to vehicle invoice prices for one particular manufacturer (Ford) is presented. Here, holdback can be determined to be: holdback=0.03*(configured msrp−freight) for this particular manufacturer, as this is the only form that holds constant across invoices associated with Ford. It will be noted that the determination of these dealer cost models may take place at almost any time interval desired, where the time interval may differ from the time interval used to obtain data from any of the data sources, and that these dealer cost models need not be determined anew when new data is obtained. Thus, while the determination of dealer cost models has been described herein with respect to the embodiment depicted in FIG. 5, it will be noted that this step is not a necessary part of the embodiment of the method described and need not occur at all or in the order depicted with respect to this embodiment. For example, dealer cost models may be determined offline and the vehicle data system configured to use these provided dealer cost models.
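Combining the holdback equation above with the relationship dealer cost=invoice−holdback gives a dealer cost model that can be sketched as below. The function and parameter names are hypothetical; the 0.03 rate is the value stated above for this particular manufacturer.

```python
def dealer_cost(invoice, configured_msrp, freight, holdback_rate=0.03):
    """dealer cost = invoice - holdback, with
    holdback = holdback_rate * (configured msrp - freight)."""
    holdback = holdback_rate * (configured_msrp - freight)
    return invoice - holdback
```

For example, with an invoice of $23,200, a configured MSRP of $25,000 and freight of $800, the holdback is 0.03*24,200=$726 and the dealer cost is $22,474.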

Returning to FIG. 5, in addition to the dealer cost models, a price ratio regression equation may be determined at step 530 using historical transaction data. Utilizing global multivariable regression, then, in one embodiment a price ratio equation may be of the form $f(x)=\sum_{i=0}^{n}\sum_{k=0}^{m}(\beta_i X_i X_{bk})$, where $X_i$ signifies global variables, $X_{bk}$ signifies bin-level variables for specific bins $b$, and the $\beta_i$ are coefficients. In one embodiment, for example, the price ratio (PriceRatio) equation may be PriceRatio=a0+a1*PRbin+a2*PRbin*dealercost+a3*PRbin*cylinders+a4*PRbin*drive+a5*PRbin*daysinmarket+Σ(a_(k)*PRbin*state_(k)) where a_(i)=coefficients, PRbin is the 4-week average price ratio for all transactions in a bin associated with a given vehicle, dealercost is a steady-state (incentives adjusted) dealer cost for the given vehicle, cylinders is the number of cylinders the given vehicle has, drive is the number of drive wheels in the drivetrain (e.g. 2 or 4 wheel drive), daysinmarket is the number of days the model of the given vehicle has been on the marketplace and state is an array of indicator variables specifying the geographic state of purchase. With this price ratio equation it is possible to compute average price paid for the given vehicle where average price paid (Avg Price Paid) equals PriceRatio (as determined from the price ratio regression equation) multiplied by DealerCost (as determined from the dealer cost model for the manufacturer of the given vehicle) or Avg Price Paid=PriceRatio*DealerCost.
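By way of illustration, the example PriceRatio equation and the resulting average price paid may be evaluated as sketched below. The function names and argument layout are assumptions, and no coefficient values are implied; in practice the coefficients would come from the regression over historical transaction data. Because exactly one state indicator variable is 1 for a given purchase, the state sum reduces to the single coefficient for that state.

```python
def price_ratio(coef, state_coef, pr_bin, dealercost, cylinders, drive,
                daysinmarket, state):
    """Evaluate the example PriceRatio equation.

    coef:       (a0, a1, a2, a3, a4, a5) regression coefficients.
    state_coef: mapping of state code -> coefficient a_k for that state's
                indicator variable (0.0 assumed for states not listed).
    """
    a0, a1, a2, a3, a4, a5 = coef
    return a0 + pr_bin * (a1
                          + a2 * dealercost
                          + a3 * cylinders
                          + a4 * drive
                          + a5 * daysinmarket
                          + state_coef.get(state, 0.0))


def avg_price_paid(ratio, dealer_cost):
    """Avg Price Paid = PriceRatio * DealerCost."""
    return ratio * dealer_cost
```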

In one embodiment, it may be desirable to model price ratios at a local level. Accordingly, certain embodiments of a price ratio equation may account for this desire by incorporation of zip code level modeling. For example, in the price ratio equation above, in place of an array of indicator variables identifying a state, variables to capture the zipcode may be included. In the context of vehicle pricing data, just incorporating a series of indicator variables identifying zipcode may, however, be less effective due to data sparsity issues, while a straight continuous mapping of zipcode may also be less effective than desired due to overconstrained implied numerical relationships amongst zipcodes. Accordingly, an indirect continuous mapping may be utilized in certain embodiments, particularly in cases where intermediary variables can be identified. For instance, continuous variables such as median income and median home price can effectively be leveraged as intermediaries. Given that zipcode is directly related to these effects (and is sometimes referred to as a proxy variable for them), it makes sense to use these types of continuous variables as intermediaries.

To accomplish this, in one embodiment a model which relates zipcode to median income is first developed. This model can be, for example, a lookup table of median incomes by zipcode (which can be, for example, acquired from the most recent census data). Then, median income is utilized as a variable X_(i) in, for example, the price ratio equation above. The price ratio equation might then have a component of a6*est_median_income or a6*PRbin*est_median_income, where est_median_income=f(zipcode) (where f(zipcode) refers to a value in the lookup table corresponding to zipcode). Thus, a price ratio equation of this type may be PriceRatio=a0+a1*PRbin+a2*PRbin*dealercost+a3*PRbin*cylinders+a4*PRbin*drive+a5*PRbin*daysinmarket+a6*PRbin*est_median_income, where a_(i)=coefficients, PRbin is the 4-week average price ratio for all transactions in a bin associated with a given vehicle, dealercost is a steady-state (incentives adjusted) dealer cost for the given vehicle, cylinders is the number of cylinders the given vehicle has, drive is the number of drive wheels in the drivetrain (e.g., 2 or 4 wheel drive), daysinmarket is the number of days the model of the given vehicle has been on the marketplace and est_median_income=f(zipcode) is the value in a lookup table corresponding to the zipcode. It will be noted that a similar approach can be taken with median home prices or any other such potential intermediary variable which it is desired to utilize in conjunction with any type of local level variable (zip code, neighborhood, area code, etc.).
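A minimal sketch of the indirect mapping, assuming the lookup table is a plain dictionary keyed by zipcode string. The table contents shown are placeholder values for illustration, not census figures, and the fallback used for a zipcode missing from the table is likewise an assumption of this sketch.

```python
# Placeholder lookup table: zipcode -> median income (illustrative values only).
MEDIAN_INCOME_BY_ZIP = {"00001": 50000.0, "00002": 100000.0}


def est_median_income(zipcode, table, default_income):
    """f(zipcode): look up the median income for a zipcode.

    Assumption: fall back to default_income when the zipcode is not in the table.
    """
    return table.get(zipcode, default_income)


def price_ratio_local(a, pr_bin, dealer_cost, cylinders, drive,
                      days_in_market, zipcode, table, default_income):
    """Price ratio with the a6*PRbin*est_median_income term in place of state indicators."""
    income = est_median_income(zipcode, table, default_income)
    interaction = (a[1] + a[2] * dealer_cost + a[3] * cylinders
                   + a[4] * drive + a[5] * days_in_market + a[6] * income)
    return a[0] + pr_bin * interaction
```

The same structure would apply to median home price or another intermediary: only the lookup table and the coefficient change.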

Again, it will be noted that the determination of the price ratio equation to utilize may take place at almost any time interval desired, where the time interval may differ from the time interval used to obtain data from any of the data sources, and that a price ratio equation need not be determined anew when new data is obtained. Thus, while the determination of a price ratio equation has been described herein with respect to the embodiment depicted in FIG. 5, it will be noted that this step is not a necessary part of the embodiment of the method described. For example, a price ratio equation may be determined offline and the vehicle data system configured to use this provided price ratio equation.

Once the data has been gathered, and the dealer cost models and price ratio regression equation to utilize have been determined, a specified vehicle configuration may be received and a corresponding bin determined at steps 540 and 550, respectively. A specified vehicle configuration may comprise values for a set of attributes of a vehicle (for example, in one embodiment the attributes of year, make, model and body type may be used). Thus, a bin corresponding to a specified vehicle configuration may comprise historical transaction data from a particular time period (for example, four weeks) associated with the values for the set of attributes corresponding to the specified vehicle.

Using the bin corresponding to the specified vehicle, at step 560, steady state pricing for the historical transaction data in the bin may be determined. Steady state prices may be determined by removing incentives from the transaction prices in the historical data. More specifically, transaction prices can be adjusted for incentives using the equation Price_ss (steady state price)=Price (transaction price)+I_(c)+λI_(d), where I_(c)=consumer incentives applied to the transaction, I_(d)=dealer incentives available for the transaction, and λ=dealer incentives passthrough rate. Thus, if a historical transaction price included $500 in consumer incentives and $1,000 in available dealer incentives for a dealer that has been determined to have a 20% dealer cash passthrough rate, that price would be adjusted to be $700 higher to account for the incentives provided at that time.

For instance, a price paid (transaction price) of $15,234 corresponding to a historical sales transaction for a Honda Civic might have been artificially low due to incentives. Since the incentives are known at the time that historical transaction took place, it can be determined what incentives were available at that time and how they affect the prices corresponding to a historical transaction (for example, what percentage of these incentives are passed through to the customer). As dealer incentives are generally unknown to the consumer and may or may not be passed through, historical transaction data can be evaluated to determine passthrough percentages for these dealer incentives based on historical averages, and the transaction prices adjusted accordingly.

For instance, using the example Honda Civic transaction, a $1,500 consumer incentive and a $1,000 dealer incentive might have been available. Since consumer incentives are 100% passed through to the consumer, that $1,500 may be added to the historical transaction price to adjust the price of the transaction to $16,734. For this particular make of vehicle, the manufacturer-to-dealer incentive passthrough rate might have been determined to be 54%. Thus, it may be determined that $540 would be deducted from the price paid by a consumer for this vehicle, on average. Thus, this amount may also be added into the price of the transaction to arrive at a figure of $17,274 as the transaction price without incentives for this transaction. Similar calculations may be performed for the other historical transactions in the specified vehicle's bin.
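The steady state adjustment is simple enough to check with a short Python sketch. The function name and argument names are hypothetical; the formula is the one given above, Price_ss=Price+I_(c)+λI_(d).

```python
def steady_state_price(price, consumer_incentive, dealer_incentive, passthrough):
    """Remove incentives from a historical transaction price.

    price              -- transaction price actually paid
    consumer_incentive -- I_c, consumer incentives applied (passed through in full)
    dealer_incentive   -- I_d, dealer incentives available at the time
    passthrough        -- lambda, fraction of dealer incentives passed to the buyer
    """
    return price + consumer_incentive + passthrough * dealer_incentive


# Honda Civic example from the text: $15,234 paid, $1,500 consumer incentive,
# $1,000 dealer incentive, 54% passthrough.
civic_ss = steady_state_price(15234, 1500, 1000, 0.54)
```

Under these inputs civic_ss evaluates to approximately 17274, matching the figure in the text, and the first example ($500 consumer, $1,000 dealer, 20% passthrough) yields an upward adjustment of $700.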

After steady state prices are determined, at step 570 the average dealer cost corresponding to the specified vehicle may be determined using the historical transaction data in the bin (including the adjusted transaction prices corresponding to the historical transactions) and the dealer cost model corresponding to the manufacturer of the specified vehicle. The price ratio corresponding to the specified vehicle may then be determined using the price ratio equation by plugging in values corresponding to the specified vehicle into the bin-level variables of the price ratio equation and solving. Using the determined price ratio, the average price paid (mean) for the specified vehicle may be determined using the equation Avg Price Paid=PriceRatio*DealerCost.

In one embodiment, at this point, if there are currently any incentives available for the specified vehicle, the adjusted transaction prices for the historical transactions and the average price paid can be scaled based on these incentives. In particular, utilizing a presented interface a user may have selected one or more consumer incentives offered in conjunction with the specified vehicle configuration. These specified consumer incentives may be utilized to adjust the transaction price. More specifically, these transaction prices may be further adjusted based on a process similar to that used in determining steady state pricing, which accounts for current incentives. Thus, the equation may be Price (transaction price)=Price_ss (steady state)−I_(c)−λI_(d), where I_(c)=consumer incentives applied to the transaction, I_(d)=dealer incentives available for the transaction, and λ=dealer incentives passthrough rate, or Avg Price Paid_(final)=Avg Price Paid_(computed)−λI_(d). In this way, as incentives may fluctuate based on geography, it is possible to display prices tailored to the user's local market prices as a way for the user to gauge how much room they have for negotiations, rather than displaying a full range of prices that has been unduly influenced by changes in available incentives. Note that, in some embodiments, it may also be desirable to adjust the determined average dealer cost downward by the full amount of the consumer and dealer incentives at this time.

Once average price paid is determined for the specified vehicle, at step 580 one or more price ranges may be determined. These price ranges may be determined using the standard deviation determined from the historical transaction data, including the adjusted transaction prices, of the bin. For example, the top end of a “good” price range may be calculated as Good=Avg Price Paid+0.15*stddev, the top end of a “great” price range can be determined as Great=Avg Price Paid−0.50*stddev, while an “overpriced” price range may be defined as any price above the “good” transaction price. Alternatively, the “good” price range may extend from the minimum of the median transaction price and the mean transaction price to one-half standard deviation below the mean price as determined based on the historical transaction data of the bin, including the adjusted transaction prices corresponding to the specified vehicle. It will be noted that any other fraction of standard deviation may be used to determine “good,” “great,” and “overpriced” price ranges, or some other method entirely may be used.
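The first range definition above can be sketched as follows. The function names are hypothetical, and the use of the sample standard deviation is an assumption, since the text does not say whether a sample or population estimate is intended.

```python
import statistics


def price_ranges(avg_price_paid, adjusted_prices, good_k=0.15, great_k=0.50):
    """Return the top ends of the "good" and "great" ranges.

    Assumption: stddev is the sample standard deviation of the bin's
    incentive-adjusted transaction prices.
    """
    stddev = statistics.stdev(adjusted_prices)
    return {
        "good_top": avg_price_paid + good_k * stddev,
        "great_top": avg_price_paid - great_k * stddev,
    }


def classify(price, ranges):
    """Label a price as "overpriced", "good" or "great" using the range tops."""
    if price > ranges["good_top"]:
        return "overpriced"
    if price > ranges["great_top"]:
        return "good"
    return "great"
```

The multipliers are exposed as parameters because the text notes that any other fraction of the standard deviation may be used.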

A display may then be generated at step 590. In one embodiment, this display may be generated by fitting a Gaussian curve to the distribution of the adjusted transaction prices corresponding to the historical pricing data of the bin associated with the specified vehicle and formatting the results for visual display. In addition, the visual display may have one or more indicators displayed relative to the displayed pricing curve which indicate where one or more pricing ranges or price points are located.

It may be helpful here to illustrate an example in conjunction with a specific vehicle. To continue with the above example, for the manufacturer Ford, suppose that the specified vehicle is a 2009 Ford Econoline Cargo Van, E-150 Commercial with no options. In this case, the dealer cost model for Ford may specify that the dealer cost is calculated off of the base MSRP minus freight charge. From data obtained from a data source it can be determined that MSRP for this vehicle is $26,880 and freight charges are $980. Accordingly, holdback for the specified vehicle is computed as Holdback=α_(C)+α₁(MSRP−Freight), where α_(C)=0, α₁=0.03 (from the above dealer model corresponding to Ford). Thus, Holdback=0.03*(26880−980)=777. Base invoice price can be determined to be $23,033 from obtained data, thus Factory Invoice=Base Invoice+Ad fees+Freight=$23,033+$428+$980=$24,441 and Dealer cost=Factory Invoice−Holdback=$24,441−$777=$23,664.
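The Ford arithmetic can be reproduced with a short sketch. The function names are hypothetical; the holdback form and the numbers are those of the example, and the result is rounded to whole dollars as the text does.

```python
def holdback(alpha_c, alpha_1, msrp, freight):
    """Holdback = alpha_C + alpha_1 * (MSRP - Freight)."""
    return alpha_c + alpha_1 * (msrp - freight)


def factory_invoice(base_invoice, ad_fees, freight):
    """Factory Invoice = Base Invoice + Ad fees + Freight."""
    return base_invoice + ad_fees + freight


def dealer_cost(invoice, holdback_amount):
    """Dealer cost = Factory Invoice - Holdback."""
    return invoice - holdback_amount


# 2009 Ford Econoline Cargo Van, E-150 Commercial, no options (values from the text).
hb = round(holdback(0, 0.03, 26880, 980))
inv = factory_invoice(23033, 428, 980)
cost = dealer_cost(inv, hb)
```

With these inputs hb is 777, inv is 24441 and cost is 23664, which agrees with the worked figures.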

Using prices from historical transaction data corresponding to the 2009 Ford Econoline Cargo Van, E-150 Commercial with no options (the bin), an average price ratio may be determined. As mentioned earlier, these prices may be adjusted for incentives.

Assume now that PriceRatio=f(x)=Σ_(i=0)^(n)Σ_(k=0)^(m)(β_(i)X_(i)X_(bk))=1.046 for the 2009 Ford Econoline Cargo Van, E-150 Commercial. In this case, Average Price Paid=DealerCost*1.046=$24,752. At this point, if there were any incentives currently available for the 2009 Ford Econoline Cargo Van, E-150 Commercial with no options, adjustments can be made. In this example, there may not be. However, if there were, for example, $1,500 in consumer incentives and $500 in dealer incentives, the prices can be rescaled based on these incentives. Thus, in this scenario, average price paid adjusted=$24,752−$1,500−0.30(500)=$23,102, presuming this vehicle has historically had a 30% passthrough rate.
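The two steps of this example, applying the price ratio and then rescaling for current incentives, can be sketched as below. The function names are hypothetical, and truncating the average price to whole dollars is an assumption made to match the $24,752 shown in the text.

```python
def average_price_paid(dealer_cost, price_ratio):
    """Avg Price Paid = DealerCost * PriceRatio, truncated to whole dollars (assumption)."""
    return int(dealer_cost * price_ratio)


def adjust_for_current_incentives(avg_price, consumer_incentive,
                                  dealer_incentive, passthrough):
    """Price = Price_ss - I_c - lambda * I_d."""
    return avg_price - consumer_incentive - passthrough * dealer_incentive


avg_paid = average_price_paid(23664, 1.046)
adjusted = adjust_for_current_incentives(avg_paid, 1500, 500, 0.30)
```

Here avg_paid is 24752 and adjusted is approximately 23102, in line with the example.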

Turning briefly to FIGS. 7A and 7B, examples of interfaces which may be used by a vehicle data system to present such pricing information to a user are depicted.

In particular, FIG. 7A is an interface presenting the determined Actual Dealer Cost, Factory Invoice, Average Paid (average price paid) and sticker price for a 2009 Ford Econoline Cargo Van, E-150 Commercial on a national level, while FIG. 7B is an interface presenting identical data at a local level.

Accordingly, for this particular example, the case of the 2009 Ford Econoline Cargo Van, E-150 Commercial, the “good” and “great” price ranges are computed as follows: the “good” range extends from min(median(P), mean(P)) down to one-half standard deviation below the mean price over recent transactions. The “great” price range extends from one-half standard deviation below the mean and lower. So, for the Econoline in this example, with no options: average price=$24,752 nationally, the upper end of the “good” price range=$24,700 (the median of the data in this example) and the upper end of the “great” price range=24752−0.5*σ_(b)=24752−0.5(828)=$24,338.
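A minimal sketch of this alternative range definition, working from summary statistics. The function name is hypothetical; the inputs are the mean, median and standard deviation quoted for the Econoline example.

```python
def ranges_from_summary(mean_price, median_price, stddev):
    """Return (upper end of "good", upper end of "great").

    "good" tops out at min(median, mean); "great" begins one-half
    standard deviation below the mean.
    """
    good_top = min(median_price, mean_price)
    great_top = mean_price - 0.5 * stddev
    return good_top, great_top


good_top, great_top = ranges_from_summary(24752, 24700, 828)
```

For these inputs the sketch returns 24700 and 24338, the values given in the text.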

A Gaussian curve can then be fit parametrically to the actual price distributions of the historical transaction data corresponding to the 2009 Ford Econoline Cargo Van, E-150 Commercial to produce embodiments of the visual display depicted in FIGS. 8A and 8B. Here, FIG. 8A is an interface visually presenting the national level price distribution for the 2009 Ford Econoline Cargo Van, E-150 Commercial after the Gaussian curve fitting process, where the price points “Actual Dealer Cost”, “Factory Invoice”, “Average Paid” (average price paid) and “Sticker Price” for a 2009 Ford Econoline Cargo Van, E-150 Commercial are indicated relative to the price curve depicting the pricing distributions for the 2009 Ford Econoline Cargo Van, E-150 Commercial. Additionally, the “good,” “great,” and “overpriced” price ranges are indicated in relation to the presented pricing curve. FIG. 8B presents a similar pricing curve related to local level data for the same vehicle.

It may be illustrative of the power and efficacy of embodiments of the present invention to discuss in more detail embodiments of various interfaces which may be employed in conjunction with embodiments of a vehicle data system. Referring to FIGS. 9A-9D, embodiments of interfaces for obtaining vehicle configuration information and for the presentation of pricing data are depicted. In particular, referring first to FIG. 9A, at this point a user may have selected a 2009 Dodge Charger 4dr Sedan R/T AWD and is presented with interface 1500, which allows the user to specify his desired vehicle configuration in more detail through the selection of one or more attributes. Notice that interface 1500 presents the user with both the invoice and sticker prices associated with each of the attributes which the user may select.

Once the user has selected any of the desired attributes he may be presented with an embodiment of interface 1510 such as that depicted in FIG. 9B, where the user may be allowed to select one or more currently available incentives associated with the selected vehicle configuration (in this case a 2009 Dodge Charger 4dr Sedan R/T AWD). In certain embodiments, the vehicle data system may access any currently available incentives corresponding to the user's specified vehicle configuration and present interface 1510 utilizing the obtained currently available incentives to allow a user to select zero or more of the available incentives. Notice here that one of the presented incentives comprises a $4500 cash amount. Suppose for purposes of the remainder of this example that the user selects this $4500 incentive.

Moving now to FIG. 9C, an embodiment of an interface presenting pricing information associated with the selected vehicle configuration (in this case a 2009 Dodge Charger 4dr Sedan R/T AWD) is depicted. Notice here that the interface specifically notes that the prices shown include the $4500 in consumer incentives selected by the user with respect to interface 1510 in this example.

Notice now, with respect to FIG. 9D, that one embodiment of an interface presenting the determined Actual Dealer Cost, Factory Invoice, Average Paid (average price paid) and sticker price for a 2009 Dodge Charger 4dr Sedan R/T AWD on a local level is presented. Notice here with respect to this interface that the user is presented not only with specific pricing points but, in addition, with data on how these pricing points were determined, including how the $4500 consumer incentive selected by the user was applied to determine the dealer cost and the average price paid. By understanding incentive information and how such incentive information and other data may pertain to the dealer cost and the average price paid by others, a user may be better able to understand and evaluate prices and pricing data with respect to their desired vehicle configuration.

It may be additionally useful here to present a graphical depiction of the creation of data which may be presented through such interfaces. As discussed above, a bin for a specific vehicle configuration may comprise a set of historical transaction data. From this historical transaction data, a histogram of dealer margin (transaction price−dealer cost), as well as other relevant statistics such as mean and standard deviation, may be calculated. For example, FIG. 10A graphically depicts a national-level histogram for a Honda Accord corresponding to a bin with a large sample set of 6003 transactions and 18 buckets (the first bucket comprising any transactions more than 2 standard deviations below the mean, 16 buckets of 0.25 standard deviations each, and the last bucket comprising any transactions more than 2 standard deviations above the mean). FIG. 10B graphically depicts another example of a histogram for a Honda Accord.
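One way to sketch the 18-bucket margin histogram is shown below. The function names are hypothetical, and the handling of values that fall exactly on a bucket edge (assigned to the bucket above) is an assumption, since the text does not specify it.

```python
import math


def bucket_index(margin, mean, stddev):
    """Map a dealer margin to one of 18 buckets (0..17).

    Bucket 0 holds margins more than 2 standard deviations below the mean,
    buckets 1..16 each span 0.25 standard deviations from -2 to +2, and
    bucket 17 holds margins 2 or more standard deviations above the mean.
    """
    z = (margin - mean) / stddev
    if z < -2:
        return 0
    if z >= 2:
        return 17
    return 1 + int(math.floor((z + 2) / 0.25))


def margin_histogram(margins, mean, stddev):
    """Count the transactions that fall in each of the 18 buckets."""
    counts = [0] * 18
    for margin in margins:
        counts[bucket_index(margin, mean, stddev)] += 1
    return counts
```

Margins here are transaction price minus dealer cost, so the mean and standard deviation passed in are those of the margins in the bin.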

FIG. 11 depicts a conversion of the histogram of FIG. 10A into a graph. FIG. 12 graphically depicts the overlaying of the histogram curve as depicted in FIG. 11 with a normalized curve by aligning the means of the histogram and the normal curve and the values for the X-axis. Once the real curve is abstracted to a simplified normal distribution, recommended pricing ranges can then be overlaid on top of the normal curve to capture some of the complexity of the actual curve.

FIG. 13 graphically depicts determined “good” and “great” price ranges based on margin ranges determined based on the percentile of people that purchased the car at below that price. One algorithm could be that the top of the “good” price range=MIN(50th percentile transaction margin, average margin); the lower end of the “good” range/upper end of the “great” range would be the 30th percentile transaction point if less than 20% of the transactions are negative margin or the 32.5th percentile transaction point if greater than 20% of the transactions are negative margin; and the lower end of the “great” price range would be the 10th percentile transaction point if less than 20% of the transactions are below Dealer Cost (have a negative margin) or the 15th percentile transaction point if greater than 20% of the transactions are negative margin. The entire data range could be utilized for display, or the range of the data may be clipped at some point of the actual data to simplify the curve. In the example depicted in FIG. 13, the data set has been clipped at the bottom of the “great” range 1302.
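The percentile-based algorithm can be sketched as follows. Several points are assumptions of this sketch rather than statements of the source: the function names, the linear-interpolation percentile, reading the 15th percentile as the rule that applies when more than 20% of transactions have negative margin, and treating a share of exactly 20% the same as "greater than 20%".

```python
def percentile(sorted_vals, p):
    """Percentile by linear interpolation between closest ranks (assumed method)."""
    if not sorted_vals:
        raise ValueError("no data")
    rank = p * (len(sorted_vals) - 1) / 100
    lo = int(rank)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (rank - lo) * (sorted_vals[hi] - sorted_vals[lo])


def margin_ranges(margins):
    """Return the "good" and "great" range boundaries, expressed as margins."""
    vals = sorted(margins)
    negative_share = sum(1 for m in vals if m < 0) / len(vals)
    few_negative = negative_share < 0.20
    average = sum(vals) / len(vals)
    return {
        "good_top": min(percentile(vals, 50), average),
        "great_top": percentile(vals, 30 if few_negative else 32.5),
        "great_bottom": percentile(vals, 10 if few_negative else 15),
    }
```

Adding the dealer cost to each boundary would convert these margin boundaries into price boundaries.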

Once a dealer cost has been established for the specified vehicle, the dealer cost is added to each bucket along the X-axis of the margin histogram for this location and vehicle specification, translating the margin curve into a price curve as graphically depicted in FIG. 14. The price histogram is then overlaid with the determined “good”/“great” price ranges (which may also be scaled by adding the dealer cost) as well as other pricing points of interest such as Dealer Cost, Factory Invoice, and MSRP. This enhanced histogram may be presented to the user in a variety of formats; for example, the histogram may be displayed as a simplified curve as depicted in FIG. 15, as a bar chart as depicted in FIG. 16, as actual data as depicted in FIG. 17, or as historical trend data as depicted in FIG. 18.

As mentioned above, to determine accurate pricing information for a specified vehicle, it is important to have accurate cost information associated with the historical transaction data associated with that vehicle. Thus, in many cases when obtaining historical transaction data from a data source it may be desired to check a dealer cost provided in conjunction with a historical transaction or to determine a dealer cost to associate with the historical transaction. As dealer cost models have been constructed for each manufacturer (see step 520), it may be possible to leverage these dealer cost models to accurately construct dealer cost for one or more historical transactions and to check a provided dealer cost or associate the determined dealer cost with a historical transaction.

FIG. 19 depicts one embodiment of a method for determining an accurate dealer cost for historical transactions. Initially, at step 910, historical transactions of obtained historical data which have accurate trim mapping may be identified. In most cases, the vehicle associated with a historical transaction may be mapped to a particular trim based on the vehicle identification number (VIN) associated with the historical transaction. However, often a 1 to 1 VIN mapping cannot be completed, as all information necessary to perform the mapping might not be included in the VIN. In other words, a particular VIN may correspond to many trim levels for a vehicle. In these cases data providers may provide a one-to-many mapping and provide multiple trims associated with a single historical transaction. This presents a problem, as an actual sales transaction may then have multiple historical transactions in the historical transaction data, each historical transaction associated with a different trim, only one of which is actually correct. Given that there is often no way of identifying which of these historical transactions is correct, an appropriate modeling approach is to either weight these transactions differently or exclude these potentially mismapped transactions from the model-building dataset. Thus, in one embodiment, after identifying these potentially mismapped transactions by, for example, determining if there are multiple historical transactions associated with a single VIN, the identified historical transactions may be excluded from the historical data set (for purposes of this method).
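The exclusion step can be sketched as a filter over transaction records. The record layout (a dictionary with a "vin" key) and the function name are assumptions of this sketch. Note that this simple rule also drops a VIN that legitimately appears twice, for example because of a resale; the text does not address that case.

```python
from collections import Counter


def exclude_ambiguous_vins(transactions):
    """Keep only transactions whose VIN appears exactly once in the data set.

    A VIN that appears on several records is taken to indicate a one-to-many
    trim mapping, so all of its records are excluded.
    """
    counts = Counter(t["vin"] for t in transactions)
    return [t for t in transactions if counts[t["vin"]] == 1]
```

The alternative mentioned in the text, weighting such transactions differently rather than excluding them, would replace the filter with a per-record weight such as 1 divided by the number of records sharing the VIN.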

Within the remaining historical transactions, then, those historical transactions with accurate information may be identified at step 920. As discussed before, the invoice and dealer cost fields of historical transaction data may be inaccurate. As one objective of the determination of dealer cost is accuracy, it is important that dealer cost be determined only for those historical transactions where it can be determined with relative accuracy. As the presence of accurate trim information or option information may be leveraged to determine dealer cost, it may be desired to further refine the historical transactions to determine those historical transactions with accurate trim mapping or identifiable options information.

Now that a set of historical transactions with accurate trim mapping and identifiable option information has been obtained, an MSRP may be determined for each of these historical transactions at step 930. Again, given that the data associated with a historical transaction may be unreliable and that alignment with configuration data (for example, dealer cost models or price ratio equation) is important, it may be desirable to determine certain data associated with the historical transaction data utilizing known data. Thus, even if an MSRP was provided or otherwise obtained, an MSRP for the historical transaction may be determined. First, a base MSRP may be determined. Specifically, with year, make, model, and trim identified specifically from the VIN, a base MSRP may be determined based on data provided by a data source. Then, using additional options identified by the historical transaction data, the manufacturer suggested retail pricing for these options can be added to the base MSRP to form the transaction MSRP. More specifically, with each historical transaction there may be a field that includes a set of options codes indicating which options were factory-installed on the particular vehicle corresponding to that historical transaction. Parsing this information, the options codes can be used in conjunction with option pricing information obtained from a data source to identify an MSRP for each factory-installed option. Summing each of the manufacturer prices for the options, the Total Options MSRP can be generated and added to the base MSRP to generate the transaction MSRP for that particular historical transaction (Transaction MSRP=Base MSRP+Total Options MSRP).

After the transaction MSRP is determined for the historical transactions, invoice pricing for each of the historical transactions may be determined at step 940. The transaction invoice may be generated similarly to the transaction MSRP. First, a base Invoice price may be determined. Specifically, with year, make, model, and trim identified specifically from the VIN, a base Invoice price may be determined based on data provided by a data source. Then, using additional options identified by the historical transaction data, pricing for these options can be added to the base Invoice price to form the transaction Invoice price. More specifically, with each historical transaction there may be a field that includes a set of options codes indicating which options were factory-installed on the particular vehicle corresponding to that historical transaction. Parsing this information, the options codes can be used in conjunction with option pricing information to assign an options Invoice price for each factory-installed option. Summing each of the option Invoice prices for the options, the Total Options Invoice price can be generated and added to the base Invoice price to generate the transaction Invoice price for that particular historical transaction (Transaction Invoice=Base Invoice+Total Options Invoice).
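Because the transaction MSRP and the transaction Invoice price are built the same way, a single sketch covers both. The function names, the comma-separated format of the options field and the price tables keyed by option code are assumptions of this sketch; real data sources may encode options differently.

```python
def parse_option_codes(options_field):
    """Split an options field into codes (assumed comma-separated)."""
    return [code.strip() for code in options_field.split(",") if code.strip()]


def transaction_total(base_price, option_codes, option_prices):
    """Base price plus the summed prices of the factory-installed options.

    Pass MSRP figures to obtain the transaction MSRP, or invoice figures
    to obtain the transaction Invoice price.
    """
    total_options = sum(option_prices[code] for code in option_codes)
    return base_price + total_options
```

An option code missing from the price table raises a KeyError in this sketch, which is one way to surface transactions whose option information is not identifiable.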

Using the determined MSRPs and Invoice prices, a dealer cost for each historical transaction may be determined at step 950. This dealer cost may be determined algorithmically utilizing the dealer cost model associated with the manufacturer of the vehicle associated with a historical transaction. More specifically, each make of vehicle (manufacturer) has an associated holdback equation as discussed above. For a particular historical transaction, using the holdback equation corresponding to the make of the vehicle to which the historical transaction pertains, the base invoice price, base MSRP, transaction invoice price and transaction MSRP determined for that historical transaction, and freight fees (which may be determined based on information obtained from a data source similarly to the determination of base invoice and base MSRP), the holdback equation can be applied to determine dealer cost (dealercost=invoice−holdback).

While more details of embodiments of a vehicle data system have been discussed above, it may be useful here to go over at a high level how embodiments of such a vehicle data system may be utilized. Accordingly, FIG. 20 depicts a flow diagram 2000 for one embodiment of the use of an embodiment of a vehicle data system.

At step 2015 an interface may be presented to a user, where the interface comprises a configurator 2017 that enables a user to specify a vehicle configuration. The configurator may allow a user to select one or more consumer incentives 2025 that are offered for the specified vehicle configuration. Using the presented interface a user may indicate that he wishes to obtain pricing data for the specified vehicle configuration (for example, through interacting with the interface with a “click” 2027 of the mouse).

Based on the values of these attributes of the specified vehicle configuration, the specified vehicle's bin may be identified. In one embodiment, a bin for a vehicle is defined as historical data associated with at least one of a group of vehicles that have the same year, make, model and body type for which there is historical transaction data within a certain time period (for example, the past four weeks or some other time period).

Using the pricing information associated with the historical transactions in the bin 2035 corresponding to the specified vehicle, an average price and average cost for the specified vehicle may be computed using the historical transaction data associated with the bin of the specified vehicle and the dealer cost model 2037 corresponding to the manufacturer of the specified vehicle. The price ratio corresponding to the specified vehicle may then be determined using the price ratio model 2045 by plugging in values corresponding to the specified vehicle into the bin-level variables of the price ratio model and solving. Using the determined price ratio, the average price paid for the specified vehicle may be determined along with one or more price ranges. A display may then be generated at step 2047 to display determined pricing data to a user via user interface 2015. The display may present such information as pricing distributions, price ranges, certain price points, etc.

Thus, using the historical transactions in the bin for the specified vehicle, desired pricing information may be obtained. In many cases, however, there may be fewer historical transactions in a bin for a specified vehicle than is desired to generate reliable or accurate predictions (for example, in the cases of a model which is relatively new to the market or is an exotic of which few models are sold). In these instances, then, to increase the accuracy of determined pricing data, it may be useful to apply different or additional price ratio models that leverage incremental data.

Embodiments of the present invention may therefore determine a set of price models to utilize in various conditions and utilize appropriate models in the cases where such conditions are extant. In particular, in one embodiment, one or more price models may be generated for use in the case where fewer than a desired number of historical transactions are present in the bin of a specified vehicle (for example, fewer than around 20 historical transactions, etc.). Even more specifically, of these price models, one or more price models may be generated for cases where there are fewer than a certain number of list prices for the specified vehicle available (for example, 3 or fewer, etc.) and cases where there are a certain number of list prices (or more) available. Certain of these price models may also pertain to new models (for example, less than 12 weeks on the market), where certain of these price models may be determined for cases where there is historical transaction data for a similar make and model from a past year and other price models determined for new models in cases where there is no such historical transaction data.

Accordingly, in one embodiment, certain price models may be determined and included in models 128 for use by a vehicle data system in determining pricing data to present to a user. Models 128 may therefore include, in addition to the general price model discussed above, the following price models:

-   a price model for instances in which there are fewer than a threshold number of historical transactions available and for which there are fewer than a threshold number of list prices available (referred to as No_Info model);
-   a price model for instances in which there are fewer than a threshold number of list prices available and for which there is historical data available (referred to as New_Trim_Prior_No_List model);
-   a price model for instances in which the specified vehicle is not a new model, there are fewer than a threshold number of historical transactions available and for which there are at least a threshold number of list prices available (referred to as Low-Volume model);
-   a price model for instances in which the specified vehicle is a new model and there is historical transaction data available for a comparable make or model from a previous year (referred to as New_Trim_Prior model); and
-   a price model for instances in which the specified vehicle is a new model and there is no historical transaction data for a comparable make or model for a previous year available (referred to as New_Trim_No_Prior model).

Each of these price models may be generated utilizing global multivariable regression and historical transaction data. It may be useful here to go into more detail about how each of these price models, also referred to as data scarcity models, is created. Referring to FIG. 21, an embodiment of method 2100 for the creation of a New_Trim_Prior price model is depicted. At step 2110, initial historical data to utilize in the creation of the price model may be obtained. This initial historical transaction data may comprise historical transaction data pertaining to transactions which occurred in the last year for vehicle models or trims which were new at the time of the transaction (in one embodiment, if the model had been on the market for less than 12 weeks). As an example, new trim, listing prices, and prior bin pricing information may be available.

Using this initial historical transaction data, at step 2120, new trim binning may be determined. Specifically, bin_prior bins can be determined for the vehicles corresponding to each of the initial historical transactions. These bin_prior bins may include historical transactions which correspond to a vehicle of the same make, model and body type (e.g. coupe, sedan, hatchback, convertible, etc.) of the previous year, which was sold the same number of weeks after its release date (referred to as Weeks_Since). For instance, if one or more initial historical transactions correspond to a 2010 Kia Rio Coupe in its 6th week on the market, a bin_prior comprising historical transactions for a 2009 Kia Rio Coupe in its 6th week on the market from the prior year may be determined. It will be noted that though both vehicles may be in their 6th week on the market, the actual dates may be dissimilar to a greater or lesser degree, as the actual release date for vehicles may vary from year to year.

At step 2130, using each bin_prior, an average 4-week price ratio may be generated.
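By way of a non-limiting illustration, steps 2120 and 2130 might be sketched as follows. The `Transaction` record, its field names, the helper names, and the choice of a window covering the four values of Weeks_Since ending at the current week are assumptions made for this example only; they are not taken from the described system.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Transaction:
    # Hypothetical record layout; all field names are assumptions.
    year: int
    make: str
    model: str
    body_type: str
    weeks_since: int    # weeks between the release date and the sale
    price_ratio: float  # e.g. price paid divided by a cost basis


def bin_prior(history, year, make, model, body_type, weeks_since, window=4):
    """Prior-year transactions of the same make, model and body type,
    sold within the `window` weeks ending at the same Weeks_Since."""
    first_week = weeks_since - window + 1
    key = (year - 1, make, model, body_type)
    return [
        t for t in history
        if (t.year, t.make, t.model, t.body_type) == key
        and first_week <= t.weeks_since <= weeks_since
    ]


def average_price_ratio(transactions):
    """Average price ratio over a bin, or None when the bin is empty."""
    if not transactions:
        return None
    return mean(t.price_ratio for t in transactions)


# Made-up sample data for a 2010 Kia Rio Coupe in its 6th week on the market.
history = [
    Transaction(2009, "Kia", "Rio", "Coupe", 5, 0.98),
    Transaction(2009, "Kia", "Rio", "Coupe", 6, 1.00),
    Transaction(2009, "Kia", "Rio", "Sedan", 6, 1.10),   # different body type
    Transaction(2009, "Kia", "Rio", "Coupe", 12, 0.90),  # outside the window
]
prior = bin_prior(history, 2010, "Kia", "Rio", "Coupe", weeks_since=6)
ratio = average_price_ratio(prior)
```

In this sketch only the two 2009 coupe transactions from weeks 5 and 6 fall in the bin; a bin_priortrim could be formed the same way by keying on trim instead of body type.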

Next, at step 2140, bin_priortrim bins may be determined. A bin_priortrim is similar to a bin_prior bin except that it is generated using a trim level instead of a body type. These bin_priortrim bins may include historical transactions which correspond to a vehicle of the same make, model and trim (e.g. LX, DX, same values for a set of vehicle attributes, etc.) of the previous year, which was sold the same number of weeks after its release date (referred to as Weeks_Since).

At step 2150, an invoice price ratio for each of the vehicles (a ratio of the invoice prices of these vehicles year over year) can be determined utilizing the bin_priortrim for the vehicle, so that the invoice prices of these vehicles are compared year over year.

At step 2160, the most recent listing price data for each vehicle year, make, model and trim for which a bin_priortrim was generated is utilized to produce an average listing price for this trim. Listing price information may be information from dealers that list prices on their websites or in print as the no-haggle price for which they are willing to sell the vehicle. This up-front pricing may be determined from one or more data sources.

At step 2170, a New_Trim_Prior model is then constructed utilizing the data determined at steps 2110-2160. This New_Trim_Prior model may be of the form:

Price Ratio = f(χ) = Σ_(i=0)^(n) Σ_(k=0)^(m) (β_(i) X_(k)), where the X's are variables and the β's are coefficients.

Specifically, one embodiment of this equation might be the following:

PriceRatio = a0 + a1*PRlisting + a2*optionsmsrp_ratio + a3*invoicetrimyoy + a4*priordiff4 + a5*f(daysinmarket) + a6*basemsrp,

where the ai are coefficients, PRlisting is the most up-to-date listing price ratio for this particular trim, optionsmsrp_ratio is the ratio of the optioned-up vehicle MSRP to the base MSRP, invoicetrimyoy is the ratio of year-over-year invoice prices for this trim, priordiff4 is the difference between the prior year price ratio and the current listing price ratio, f(daysinmarket) is any suitable transformation of the number of days the model has been on the marketplace, and basemsrp is the base model MSRP for the vehicle. In this case, a straight linear construction of the variable is performed. The New_Trim_Prior model is further described below with reference to FIG. 25.
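As a non-limiting sketch, the embodiment of the equation given above might be evaluated as shown below. The function name, the argument order, the default identity transformation standing in for f(daysinmarket), and the coefficient values are all assumptions chosen for illustration; in practice the coefficients would come from the regression described above.

```python
def price_ratio(coeffs, pr_listing, optionsmsrp_ratio, invoicetrimyoy,
                priordiff4, days_in_market, basemsrp,
                transform=lambda days: days):
    """Evaluate a0 + a1*PRlisting + a2*optionsmsrp_ratio + a3*invoicetrimyoy
    + a4*priordiff4 + a5*f(daysinmarket) + a6*basemsrp.

    `coeffs` is the tuple (a0, ..., a6); `transform` plays the role of
    f(daysinmarket) and defaults to a straight linear (identity) form.
    """
    a0, a1, a2, a3, a4, a5, a6 = coeffs
    return (a0
            + a1 * pr_listing
            + a2 * optionsmsrp_ratio
            + a3 * invoicetrimyoy
            + a4 * priordiff4
            + a5 * transform(days_in_market)
            + a6 * basemsrp)


# Placeholder coefficients and inputs, not fitted values.
example_coeffs = (0.05, 0.90, 0.02, 0.01, 0.10, -0.0001, 0.0)
example_ratio = price_ratio(
    example_coeffs,
    pr_listing=1.02,
    optionsmsrp_ratio=1.10,
    invoicetrimyoy=1.03,
    priordiff4=-0.01,
    days_in_market=42,
    basemsrp=18000.0,
)
```

Passing a different callable as `transform` (for example a logarithm) would correspond to a non-linear f(daysinmarket).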

In one embodiment, a New_Trim_Prior_No_List model may be constructed in a similar manner as described above with the exception of step 2160. In this case, prior year data, and not the most recent listing price data, is used to determine the average listing price for the new trim. This is possible because, while vehicle prices may change quite a bit from year to year, pricing variance for a given trim is relatively stable.

Turning now to FIG. 22, an embodiment of method 2200 for the creation of a New_Trim_No_Prior price model is depicted. At step 2210, initial historical data to utilize in the creation of the price model may be obtained. This initial historical transaction data may comprise historical transaction data pertaining to transactions which occurred in the last year for vehicle models or trims which were new at the time of the transaction (in one embodiment, if the model had been on the market for less than 12 weeks).

At step 2220, the most recent listing price data for each vehicle year, make, model and trim for which an initial historical transaction exists may be obtained to produce an average listing price for this trim. At step 2230, a New_Trim_No_Prior price model is then determined utilizing the data determined at steps 2210 and 2220 in the same manner as discussed above with respect to step 2170.

Moving on, FIG. 23 depicts an embodiment of method 2300 for the creation of a No_Info price model. At step 2310, initial historical data to utilize in the creation of the price model may be obtained. This initial historical transaction data may comprise all historical transaction data pertaining to transactions which occurred in the last year.

At step 2320, the most recent listing price data for each vehicle year, make, model and trim for which an initial historical transaction exists may be obtained to produce an average listing price for this trim. At step 2330, a No_Info price model is then determined utilizing the data determined at steps 2310 and 2320 in the same manner as discussed above with respect to step 2170.

FIG. 24 depicts an embodiment of method 2400 for the creation of a Low-Volume price model. At step 2410, initial historical data to utilize in the creation of the price model may be obtained. This initial historical transaction data may comprise all historical transaction data pertaining to transactions which occurred in the last year.

At step 2420, the most recent listing price data for each vehicle year, make, model and trim for which an initial historical transaction exists may be obtained to produce an average listing price for this trim. At step 2430, a Low-Volume price model is then determined utilizing the data determined at steps 2410 and 2420 in the same manner as discussed above with respect to step 2170.

Once these price models are generated, vehicle data system 120 may utilize these price models to more accurately determine pricing data associated with specific vehicle configurations. FIG. 25 depicts one embodiment of method 2500 for the operation of an embodiment of a vehicle data system which employs embodiments of the models discussed above.

At step 2508, a vehicle data system may operate as described above. Once a specified vehicle configuration is received from a user through an interface of the vehicle data system, it may be determined at step 2510 if sufficient historical transaction data for the specified vehicle configuration exists. More particularly, a bin of historical transaction data associated with the specified vehicle configuration for a particular time period (for example, 4 weeks) may be obtained and the general price model applied to this historical transaction data. If there is a sufficient number of historical transactions (in one embodiment, 20 or more transactions) associated with the specified vehicle configuration, at step 2512 the general price ratio model may be utilized as described above and pricing data for the specified vehicle configuration may be determined using only the general price ratio model, including, for example, an average price paid at a national level.

If, however, fewer than the threshold number of historical transactions exist, other price models may be utilized in addition to the general price model. In this case, it may be determined at step 2520 if there are a threshold number (in one embodiment, three) of list prices (which may in one embodiment be obtained from a list price provider such as ZAG or the like; other sources are possible as well). If there are fewer than the threshold number of list prices, it may be determined at step 2525 whether the specified vehicle configuration (for example, a particular model) is new on the market (in one embodiment, about 12 weeks or less) but has prior year data available. If so, a New_Trim_Prior_No_List model as described above may be utilized at step 2590. Otherwise, a No_Info model as described above may be utilized at step 2580.

In one embodiment, to utilize a No_Info model, historical transaction data corresponding to the year, make, model and trim of the specified vehicle configuration may be obtained. This will be used to construct the general price model. Additionally, a separate No_Info model equation may be constructed on the full set of all transaction data as a function of incentives information, options data or other vehicle configuration information. The No_Info price model utilized by the vehicle data system may then be applied using this data to generate an average price for the specified vehicle. A weighting factor can then be applied to combine the results from the two models (the general price model and the No_Info price model) to generate an average price paid.

Utilizing the determined average price, a pricing distribution may be generated. Here, a variance may be estimated: σ=σ_(tot avg), where σ_(tot avg) is the average standard deviation of the data utilized and σ is the predicted standard deviation of price ratios for the specified vehicle configuration at the trim level. From this data, “Good” and “Great” price ranges can be determined.

Returning to step 2520, if at least a threshold number of list prices exist, it can be determined at step 2530 if the specified vehicle configuration is a new model (where, in one embodiment, a new model may be any model that has been on the market for about twelve weeks or less). If it is not a new model then, at step 2570, in addition to the general price ratio model, a Low-Volume model may be utilized.

In one embodiment, to utilize a Low-Volume model, historical transaction data corresponding to the year, make, model and trim of the specified vehicle configuration may be obtained along with the most recent listing price data for the specified vehicle configuration. From the listing price data, an average listing price may be determined for the specified vehicle configuration at a trim level. The Low-Volume price model utilized by the vehicle data system may then be applied using this data to generate an average price for the specified vehicle. A weighting factor can then be applied to combine the results from the two models (the general price model and the Low-Volume price model) to generate an average price paid.

Utilizing the determined average price, a pricing distribution may be generated. Here, historical transaction data for the year, make, model and trim level of the specified vehicle configuration and the most recent listing price data for the specified vehicle configuration may be utilized to produce an average standard deviation of listing price offsets for this trim. Next, construct σ=α₀+α₁*(σ_(listing)/P_(invoice)), where σ_(listing) is the standard deviation of listing prices for the historical transaction data for the year, make, model and trim level of the specified vehicle configuration. Here, σ is the predicted standard deviation of price ratios for this specified vehicle configuration at the trim level. From this data, “Good” and “Great” price ranges can be determined.
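The construction σ=α₀+α₁*(σ_(listing)/P_(invoice)) might be sketched as follows, purely as an illustration. The function name, the use of the sample standard deviation, and the values of α₀ and α₁ are assumptions for this example; the source does not specify which estimator is used or what the fitted coefficients are.

```python
from statistics import stdev


def predicted_sigma(listing_prices, invoice_price, alpha0, alpha1):
    """Predicted standard deviation of price ratios at the trim level:
    sigma = alpha0 + alpha1 * (sigma_listing / P_invoice).

    Uses the sample standard deviation of the listing prices (an
    assumption), so at least two listing prices are required.
    """
    if invoice_price <= 0:
        raise ValueError("invoice price must be positive")
    sigma_listing = stdev(listing_prices)
    return alpha0 + alpha1 * (sigma_listing / invoice_price)


# Made-up listing prices and placeholder coefficients.
sigma = predicted_sigma(
    listing_prices=[19000.0, 20000.0, 21000.0],
    invoice_price=20000.0,
    alpha0=0.01,
    alpha1=0.5,
)
```

Dividing by the invoice price expresses the spread of listing prices on the same scale as a price ratio before the linear adjustment is applied.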

Returning to step 2530, if the specified vehicle configuration is a new model, it can be determined at step 2540 if there is historical transaction data for the same make and model as the specified vehicle configuration from a prior year. If there is such historical transaction data, a New_Trim_Prior model may be utilized at step 2550 in addition to the general price ratio model. Specifically, historical transaction data corresponding to the specified vehicle configuration may be obtained. Then, a first average price may be determined by applying the general price model to the obtained historical transaction data, as discussed above.

Next, a second average price may be determined for the specified vehicle configuration using the New_Trim_Prior model. In particular, a bin_prior bin can be determined for the specified vehicle configuration. This bin_prior bin may include historical transactions which correspond to a vehicle of the same make, model and body type of the previous year, which was sold the same number of weeks after its release date (referred to as Weeks_Since). Using this bin_prior, an average 4-week price ratio may be generated.

A bin_priortrim bin may then be determined for the specified vehicle configuration. This bin_priortrim bin may include historical transactions which correspond to a vehicle of the same make, model and trim of the previous year, which was sold the same number of weeks after its release date. Using the bin_priortrim, a year over year invoice price ratio for the specified vehicle configuration can be determined. Next, the most recent listing price data for the specified vehicle configuration may be obtained. The New_Trim_Prior price model utilized by the vehicle data system may then be applied to the determined data to generate the second average price.

A weighting factor can then be applied to combine the results from the two models (the general price model and this New_Trim_Prior model) to generate an average price paid. This weighting factor may be, for example, as simple as a straight linear weighted average of the result of the general price model and the result from the New_Trim_Prior model. For example, if n is the number of transactions in the bin for the existing model, then let m=20−n, W1=m/20 and W2=n/20, so that MR_model_score=W1*(MR_newtrim prior model result)+W2*(general price model result). It will be apparent that the weighting factor could be any desired less or more complicated function for combining the results of the New_Trim_Prior model and the results of the general pricing model, or for combining the results of the general pricing model with any other model as discussed herein.
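The linear weighting described above (W1=m/20, W2=n/20 with m=20−n) might be sketched as follows. The function and parameter names are hypothetical, and clamping n to the range 0 to 20 is an added assumption so that the weights stay between 0 and 1.

```python
def combine_model_results(general_result, scarcity_result, n_transactions,
                          threshold=20):
    """Weighted average of the general price model result and a data
    scarcity model result (for example, the New_Trim_Prior result).

    With n transactions in the bin and m = threshold - n:
        W1 = m / threshold   (weight on the data scarcity model)
        W2 = n / threshold   (weight on the general price model)
    """
    # Clamp n so the weights stay in [0, 1]; this guard is an assumption.
    n = min(max(n_transactions, 0), threshold)
    m = threshold - n
    w1 = m / threshold
    w2 = n / threshold
    return w1 * scarcity_result + w2 * general_result


# Five transactions in the bin: 75% weight on the data scarcity model.
score = combine_model_results(general_result=1.00, scarcity_result=0.90,
                              n_transactions=5)
```

The fewer transactions the bin contains, the more the combined result leans on the data scarcity model; once the bin holds the threshold number of transactions, the general price model result is returned unchanged.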

Utilizing the determined average price, a pricing distribution may be generated. Presuming the historical transaction prices are normally distributed, a Gaussian distribution may be parameterized by computing the 2nd moment. Using this sample variance, the Gaussian assumption, and the sample mean (as computed above), the Gaussian curve and each desired price range (“Good,” “Great,” etc.) can be determined. Prior statistical research has determined that while prices can change quite a bit year over year, pricing variance for a given trim is relatively stable. Thus, in one embodiment a 4-week bin_prior bin can be determined for the specified vehicle configuration. This bin_prior bin may include historical transactions which correspond to a vehicle of the same make, model and body type (e.g. coupe, sedan, hatchback, convertible, etc.) of the previous year, which was sold the equivalent number of days after its release date as the specified vehicle configuration (or up to 28 days prior).

Next, σ=σ_(4wk,priorbin) can be determined, where σ_(4wk,priorbin) is the standard deviation of the 4-week bin_prior's price ratios from weeks_since=n−3 to n, with n being the number of weeks since that vehicle was introduced to the market, and σ is the predicted standard deviation of price ratios for this new trim. Utilizing this data, “Good” and “Great” price ranges can be determined.
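As a non-limiting sketch, σ_(4wk,priorbin) and the resulting Gaussian might be computed as shown below. The mapping from weeks_since to price ratios, the function names, the use of the population standard deviation, and in particular the percentile cut points used for the two ranges are assumptions for illustration; this passage does not define where the “Good” and “Great” boundaries fall.

```python
from statistics import NormalDist, pstdev


def prior_bin_sigma(ratios_by_week, n):
    """Standard deviation of prior-year price ratios for
    weeks_since = n-3 .. n (the 4-week bin_prior)."""
    window = [
        ratio
        for week in range(n - 3, n + 1)
        for ratio in ratios_by_week.get(week, [])
    ]
    if len(window) < 2:
        raise ValueError("need at least two prior-year price ratios")
    return pstdev(window)


def price_distribution(mean_ratio, sigma):
    """Gaussian over price ratios, parameterized by mean and sigma."""
    return NormalDist(mu=mean_ratio, sigma=sigma)


# Made-up prior-year price ratios keyed by weeks_since.
ratios_by_week = {3: [0.96], 4: [0.98], 5: [1.00, 1.02], 6: [1.04], 10: [2.00]}
sigma = prior_bin_sigma(ratios_by_week, n=6)
dist = price_distribution(mean_ratio=1.00, sigma=sigma)

# Hypothetical cut points: ratios below the 25th percentile and below the
# median, respectively.
great_upper = dist.inv_cdf(0.25)
good_upper = dist.inv_cdf(0.50)
```

Here the mean comes from the combined average price and only the spread is borrowed from the prior year, in keeping with the observation that pricing variance for a given trim is relatively stable.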

Returning to step 2540, if there is no historical transaction data for the same make and model as the specified vehicle configuration from a prior year, a New_Trim_No_Prior model may be utilized. In one embodiment, to utilize a New_Trim_No_Prior model, historical transaction data corresponding to the year, make, model and trim of the specified vehicle configuration may be obtained along with the most recent listing price data for the specified vehicle configuration. From the listing price data, an average listing price may be determined for the specified vehicle configuration at a trim level. The New_Trim_No_Prior price model utilized by the vehicle data system may then be applied using this data to generate an average price for the specified vehicle.

A weighting factor can then be applied to combine the results from the two models (the general price model and the New_Trim_No_Prior price model) to generate an average price paid. Utilizing the determined average price, a pricing distribution may be generated. Here, historical transaction data for the year, make, model and trim level of the specified vehicle configuration and the most recent listing price data for the specified vehicle configuration may be utilized to produce an average standard deviation of listing price offsets for this trim.

Next, construct σ=α₀+α₁*(σ_(listing)/P_(invoice)), where σ_(listing) is the standard deviation of listing prices for the historical transaction data for the year, make, model and trim level of the specified vehicle configuration and σ is the predicted standard deviation of price ratios for this specified vehicle configuration at the trim level. From this data, “Good” and “Great” price ranges can be determined.
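The branching of method 2500 described with reference to FIG. 25 might be summarized by the following sketch. The function name, parameter names and return values are hypothetical; the default thresholds of 20 transactions and three list prices are the values given for one embodiment.

```python
def select_data_scarcity_model(n_transactions, n_list_prices, is_new_model,
                               has_prior_year_data,
                               min_transactions=20, min_list_prices=3):
    """Return the name of the data scarcity model to use alongside the
    general price model, or None when the general model alone suffices."""
    # Step 2510: enough historical transactions for the general model alone.
    if n_transactions >= min_transactions:
        return None  # step 2512

    # Step 2520: are there at least a threshold number of list prices?
    if n_list_prices < min_list_prices:
        # Step 2525: new on the market but prior year data is available.
        if is_new_model and has_prior_year_data:
            return "New_Trim_Prior_No_List"  # step 2590
        return "No_Info"  # step 2580

    # Step 2530: is the specified vehicle configuration a new model?
    if not is_new_model:
        return "Low-Volume"  # step 2570

    # Step 2540: prior year data for the same make and model?
    if has_prior_year_data:
        return "New_Trim_Prior"  # step 2550
    return "New_Trim_No_Prior"


choice = select_data_scarcity_model(
    n_transactions=4, n_list_prices=5,
    is_new_model=True, has_prior_year_data=True,
)
```

In each branch that returns a model name, the selected model's result would then be combined with the general price model's result through a weighting factor as discussed above.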

In the foregoing specification, the invention has been described with reference to specific embodiments. However, one of ordinary skill in the art appreciates that various modifications and changes can be made without departing from the scope of the invention as set forth in the claims below. Accordingly, the specification and figures are to be regarded in an illustrative rather than a restrictive sense, and all such modifications are intended to be included within the scope of invention.

Benefits, other advantages, and solutions to problems have been described above with regard to specific embodiments. However, the benefits, advantages, solutions to problems, and any component(s) that may cause any benefit, advantage, or solution to occur or become more pronounced are not to be construed as a critical, required, or essential feature or component of any or all the claims.

1. A method, comprising: at a vehicle data system running on one or more server machines, receiving a specified vehicle configuration; determining pricing data corresponding to the specified vehicle configuration, wherein determining pricing data comprises applying two or more models to a set of historical transaction data, wherein the set of historical transaction data comprises data on a number of transactions associated with vehicles of the specified vehicle configuration, wherein the number of transactions is less than a first threshold, wherein the two or more models include a general price model and a data scarcity model; and generating an interface based on the pricing data.
 2. The method according to claim 1, further comprising: using the general price model to generate a first average price for the specified vehicle configuration; using the data scarcity model to generate a second average price for the specified vehicle configuration; combining results from the general price model and the data scarcity model to generate a third average price; and utilizing the third average price to generate a pricing distribution for the specified vehicle configuration.
 3. The method according to claim 1, wherein the set of historical transaction data comprises data on a number of list prices associated with vehicles of the specified vehicle configuration and wherein the number of list prices meets a second threshold, further comprising: determining whether the specified vehicle configuration is associated with a new vehicle model.
 4. The method according to claim 3, wherein the specified vehicle configuration is associated with a new vehicle model, wherein the set of historical transaction data comprises data on the specified vehicle configuration from a prior year, and wherein the data scarcity model is a New_Trim_Prior model.
 5. The method according to claim 3, wherein the specified vehicle configuration is associated with a new vehicle model, wherein data on the specified vehicle configuration from a prior year is not available, and wherein the data scarcity model is a New_Trim_No_Prior model.
 6. The method according to claim 3, wherein the specified vehicle configuration is not associated with a new vehicle model, and wherein the data scarcity model is a Low-Volume model.
 7. The method according to claim 1, wherein the set of historical transaction data comprises data on a number of list prices associated with vehicles of the specified vehicle configuration, wherein the number of list prices is less than a second threshold, wherein the specified vehicle configuration is not associated with a new vehicle model, wherein data on the specified vehicle configuration from a prior year is not available, and wherein the data scarcity model is a No_Info model.
 8. The method according to claim 1, wherein the set of historical transaction data comprises data on a number of list prices associated with vehicles of the specified vehicle configuration, wherein the number of list prices is less than a second threshold, wherein the specified vehicle configuration is associated with a new vehicle model, wherein the set of historical transaction data comprises data on the specified vehicle configuration from a prior year, and wherein the data scarcity model is a New_Trim_Prior_No_List model.
 9. A computer program product comprising at least one non-transitory computer readable medium storing instructions translatable by one or more processors to perform: determining pricing data corresponding to a specified vehicle configuration, wherein determining pricing data comprises applying two or more models to a set of historical transaction data, wherein the set of historical transaction data comprises data on a number of transactions associated with vehicles of the specified vehicle configuration, wherein the number of transactions is less than a first threshold, wherein the two or more models include a general price model and a data scarcity model; and generating an interface based on the pricing data.
 10. The computer program product of claim 9, wherein the instructions are further translatable by the one or more processors to perform: using the general price model to generate a first average price for the specified vehicle configuration; using the data scarcity model to generate a second average price for the specified vehicle configuration; combining results from the general price model and the data scarcity model to generate a third average price; and utilizing the third average price to generate a pricing distribution for the specified vehicle configuration.
 11. The computer program product of claim 9, wherein the specified vehicle configuration is associated with a new vehicle model, wherein the set of historical transaction data comprises data on the specified vehicle configuration from a prior year, and wherein the data scarcity model is a New_Trim_Prior model.
 12. The computer program product of claim 9, wherein the specified vehicle configuration is associated with a new vehicle model, wherein data on the specified vehicle configuration from a prior year is not available, and wherein the data scarcity model is a New_Trim_No_Prior model.
 13. The computer program product of claim 9, wherein the specified vehicle configuration is not associated with a new vehicle model, and wherein the data scarcity model is a Low-Volume model.
 14. The computer program product of claim 9, wherein the set of historical transaction data comprises data on a number of list prices associated with vehicles of the specified vehicle configuration, wherein the number of list prices is less than a second threshold, wherein the specified vehicle configuration is not associated with a new vehicle model, wherein data on the specified vehicle configuration from a prior year is not available, and wherein the data scarcity model is a No_Info model.
 15. The computer program product of claim 9, wherein the set of historical transaction data comprises data on a number of list prices associated with vehicles of the specified vehicle configuration, wherein the number of list prices is less than a second threshold, wherein the specified vehicle configuration is associated with a new vehicle model, wherein the set of historical transaction data comprises data on the specified vehicle configuration from a prior year, and wherein the data scarcity model is a New_Trim_Prior_No_List model.
 16. A system, comprising: one or more computing devices; and a vehicle data system coupled to the one or more computing devices over a network, the vehicle data system comprising: a processing module, the processing module configured to: determine pricing data corresponding to a specified vehicle configuration, wherein determining pricing data comprises applying two or more models to a set of historical transaction data, wherein the set of historical transaction data comprises data on a number of transactions associated with vehicles of the specified vehicle configuration, wherein the number of transactions is less than a first threshold, wherein the two or more models include a general price model and a data scarcity model; and generate an interface based on the pricing data.
 17. The system of claim 16, wherein the processing module is further configured to: generate a first average price for the specified vehicle configuration using the general price model; generate a second average price for the specified vehicle configuration using the data scarcity model; combine results from the general price model and the data scarcity model to generate a third average price; and utilize the third average price to generate a pricing distribution for the specified vehicle configuration.
 18. The system of claim 16, wherein the specified vehicle configuration is associated with a new vehicle model, wherein the set of historical transaction data comprises data on the specified vehicle configuration from a prior year, and wherein the data scarcity model is a New_Trim_Prior model.
 19. The system of claim 16, wherein the specified vehicle configuration is associated with a new vehicle model, wherein data on the specified vehicle configuration from a prior year is not available, and wherein the data scarcity model is a New_Trim_No_Prior model.
 20. The system of claim 16, wherein the specified vehicle configuration is not associated with a new vehicle model, and wherein the data scarcity model is a Low-Volume model.
 21. The system of claim 16, wherein the set of historical transaction data comprises data on a number of list prices associated with vehicles of the specified vehicle configuration, wherein the number of list prices is less than a second threshold, wherein the specified vehicle configuration is not associated with a new vehicle model, wherein data on the specified vehicle configuration from a prior year is not available, and wherein the data scarcity model is a No_Info model.
 22. The system of claim 16, wherein the set of historical transaction data comprises data on a number of list prices associated with vehicles of the specified vehicle configuration, wherein the number of list prices is less than a second threshold, wherein the specified vehicle configuration is associated with a new vehicle model, wherein the set of historical transaction data comprises data on the specified vehicle configuration from a prior year, and wherein the data scarcity model is a New_Trim_Prior_No_List model.